How Data Brokers Profile, Segment, and Score Us

Data brokers are defined as businesses that knowingly collect and sell to third parties the personal information of a consumer with whom the business does not have a direct relationship. Unlike Big Tech firms like Meta and Google, which primarily collect our online activity, data brokers collect information about us from both online and offline sources, thus surveilling us just as significantly. Their data sources include property records, purchase history, social media profiles, and web and mobile app activity tracking. So, for example, data brokers know the websites you have visited (e.g., a website on depression), your credit card purchases (e.g., you purchased adult diapers), and the apps you have installed (e.g., a gay dating app or a Muslim prayer app). And their hooks in your mobile apps can even track your location (e.g., you visited a Planned Parenthood).[1]

Once data brokers have all this data, besides selling and sharing the raw information directly, they use technologies such as Artificial Intelligence (AI) to aggregate the data and draw inferences from it. This processing often runs on Big Tech's cloud computing platforms.

In this blog post, I will give some examples of what data brokers do with our collected personal data besides selling it in raw format, namely how they profile, segment, and score us. Some of this content is excerpted from my book Containing Big Tech, which you can order here.

Dall-E: data brokers profiling, segmenting and scoring consumers


Profiling

Given the growing number of data sources that data brokers can collect from, driven by the explosion of mobile, IoT, etc., we can see why a firm like Acxiom could advertise in 2021 that it collects up to 11,000-plus “data attributes” (also known as “data elements”) per person for 2.5 billion consumers. This is up from 3,000 data attributes and 700 million consumers in 2014.[2] And back in 2012, it was reported that Acxiom processed 50 trillion records per year, so one can easily imagine that the volume of data now being processed is many times that.[3]

Even though a given data source may only provide relatively few data elements, as data brokers correlate and aggregate more data elements from disparate data sources, they can form a more detailed composite “profile” of a consumer and their life. Or, as EFF puts it, “These humble parts can be combined into an exceptionally revealing whole.” [4]
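To make the idea of a composite profile concrete, here is a minimal sketch in Python. It is only an illustration: the three sources, the field names, and the choice of an email address as the matching key are all my assumptions, not a description of how any particular data broker links records.

```python
from collections import defaultdict

# Hypothetical records from three unrelated sources. The shared "email" key is
# an assumption for illustration; real matching may rely on other identifiers.
property_records = [{"email": "pat@example.com", "home_owner": True}]
purchase_history = [{"email": "pat@example.com", "recent_purchase": "crib"}]
app_activity = [{"email": "pat@example.com", "installed_app": "prayer app"}]


def build_profiles(*sources):
    """Merge every record that shares an identifier into one profile dict."""
    profiles = defaultdict(dict)
    for source in sources:
        for record in source:
            key = record["email"]
            # Copy every attribute except the join key into the profile.
            for name, value in record.items():
                if name != "email":
                    profiles[key][name] = value
    return dict(profiles)


profiles = build_profiles(property_records, purchase_history, app_activity)
print(profiles["pat@example.com"])
```

Each source contributes a single attribute, yet the merged record already says more about the person than any one source does on its own.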

Data brokers not only collect “actual data elements” associated with a consumer from a wide variety of data sources but also create inferred or “derived data elements,” which are predictive in nature and also become part of a consumer’s profile.[5] For example, researchers have shown that “sensitive personal attributes such as ethnicity, religious and political views, relationship status, sexual orientation, and alcohol, cigarette, and drug use can be quite accurately inferred from someone's Facebook likes.”[6]

Another example is that a data broker might infer that a consumer is likely to default on a loan based on their past financial behavior.[7] And the data broker Experian, which sells names of expecting mothers, likely infers that a woman is pregnant based on what purchases are made (e.g., a crib, a book on pregnancy, etc.), what websites are visited or web searches performed (e.g., “what to expect when expecting”), and social media posts that mention being pregnant.[8]
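A derived data element can be as simple as a rule applied to actual data elements. The sketch below is hypothetical: the signal list, the threshold, and the function name are invented for illustration, and I am not suggesting this is how Experian or any other broker actually does it.

```python
# Hypothetical purchase signals; the items and threshold are invented.
PREGNANCY_SIGNALS = {"crib", "pregnancy book", "prenatal vitamins"}


def derive_expecting_parent(profile, min_signals=2):
    """Return True when enough purchase signals suggest an expecting parent."""
    purchases = set(profile.get("purchases", []))
    matches = purchases & PREGNANCY_SIGNALS
    return len(matches) >= min_signals


# Actual data elements go in; a derived (inferred) element comes out
# and is stored alongside them in the same profile.
profile = {"purchases": ["crib", "pregnancy book", "coffee"]}
profile["derived_expecting_parent"] = derive_expecting_parent(profile)
print(profile["derived_expecting_parent"])
```

Note that the consumer never stated the inferred fact anywhere. It exists only because someone chose to compute it.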

Segmentation

Data brokers will then group and further profile consumers into “segments” or “categories” based on actual or derived data elements or a combination of the two. Some of the segments are populated based on “look-alike models” that predict a consumer’s behavior based on the past behavior of similar people.[9] 
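One simple way to picture a look-alike model is as a distance check against people already in a segment. The following is a toy sketch under that assumption: the features, the numbers, and the distance cutoff are made up, and production models are presumably far more elaborate (for one thing, they would scale features so that income does not swamp age).

```python
import math

# Hypothetical "seed" consumers already known to belong to a segment,
# described by (age, annual income in thousands of dollars).
seed_members = [(42, 210), (45, 195), (39, 220)]
candidates = {"consumer_a": (41, 205), "consumer_b": (23, 35)}


def centroid(points):
    """Average each feature across the seed members."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def look_alikes(seeds, pool, max_distance):
    """Return names of candidates within max_distance of the seed centroid."""
    center = centroid(seeds)
    return sorted(
        name
        for name, features in pool.items()
        if math.dist(features, center) <= max_distance
    )


segment = look_alikes(seed_members, candidates, max_distance=25)
print(segment)
```

Here the candidate who resembles the seed group is added to the segment, even though nothing is known about that person's actual behavior.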

Want to buy data on households segmented by characteristics including age, income, home ownership, presence of children, and spending patterns? A data broker has broken down US households into 26 segments (e.g., “Big Spender Parents” that are “dominated by middle-aged, traditional family households with children” with an average income of $207,000).[10] Experian classifies people into various financial categories like “Credit Hungry Card Switcher,” “Disciplined, Passive Borrower” and “Insecure Debt Dependent.”[11] And at the start of the Covid-19 pandemic, Experian created various “At-Risk Audience” segments based on its access to data about consumers defaulting on mortgages and filing for unemployment.[12]

Other data brokers add ethnicity or geography into the mix and offer segmentation focused on vulnerable low-income minority communities (e.g., “Ethnic Second-City Strugglers”) or by grouping together low-income families in rural areas (e.g., “Rural and Barely Making It”).[13] 

Some data brokers also create segments regarding highly sensitive health matters with categories such as “Expectant Parent,” “Diabetes Interest,” and “Cholesterol Focus.”[14] One privacy software company found that a data broker created categories of consumers with the following medical conditions: bedwetting, bowel irregularity, trouble sleeping due to breathing, diet concerns, using adult diapers, cold sores, and using depression medications.[15]

Most consumers would probably not want sensitive personal data, such as knowledge of their bedwetting, available for sale. Note that segmenting consumers is not a new idea: a company called Claritas pioneered this concept in the 1970s with a product called PRIZM that defined groups of consumers based on demographics such as income.[16] But in today’s world of data brokers, no sensitive personal information is off-limits for segmentation.

Scoring

Data brokers also score us. Scores are predictions about consumer behavior that software algorithms generate from actual and inferred data elements to facilitate automated decision-making. Scoring provides insight into consumers by assigning a number or range that signifies the likelihood that a consumer will exhibit specific characteristics or perform certain actions.[17]
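As a rough illustration of how data elements can be turned into a single number, the sketch below feeds a weighted sum of attributes through a logistic function to get a 0 to 100 score. The attribute names, weights, and bias are hypothetical values I made up. A real scoring model would presumably learn its weights from historical data and use far more inputs.

```python
import math

# Hypothetical weights; invented for illustration only.
WEIGHTS = {"late_payments": 0.8, "years_at_address": -0.3}
BIAS = -1.0


def likelihood_score(attributes):
    """Map weighted attributes to a 0-100 score via the logistic function."""
    z = BIAS + sum(WEIGHTS[name] * attributes.get(name, 0) for name in WEIGHTS)
    probability = 1 / (1 + math.exp(-z))
    return round(probability * 100)


risky = likelihood_score({"late_payments": 4, "years_at_address": 1})
stable = likelihood_score({"late_payments": 0, "years_at_address": 10})
print(risky, stable)
```

The consumer with more late payments receives the higher score, and an automated system could then act on that number without a human ever reviewing the underlying data.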

Most of us are familiar with the concept of credit scores that predict our likelihood of defaulting on a loan. Organizations use credit scores to decide whether to offer a consumer a mortgage or a credit card. Credit scores also help determine your credit limit or the interest rate you receive on a loan or credit card.[18]

Historically, credit scores were based on an analysis of our financial transactions. But now, some companies score a consumer’s creditworthiness based on non-financial data such as social media postings or web searches. So, for example, “how someone fills out an online form or navigates on a website, the grammar and punctuation of one’s text messages, and the battery status on said individual’s phone” can, in theory, provide insight into calculating credit scores.[19]

But scoring has gone beyond its roots of analyzing creditworthiness or detecting fraudulent financial transactions. It now applies to all aspects of our lives. For example, certain data brokers can create scores that predict if a potential tenant will break a lease, absorb rent increases, or pay the rent on time. Another data broker can manufacture a score regarding whether a potential employee should be hired based on “analyzing ‘tens of thousands of factors’ including a person’s facial expressions and voice intonations.”[20] Marketing data brokers can “lead score” you on your likelihood to buy a specific product. Your hold time for customer support can depend on your score as a profitable customer. Even your health is scored, based on your purchases, your ethnicity, and how much online shopping you do, and that score can influence what type of health insurance coverage you may get.[21]

Other examples of data broker scores include “The Pregnancy Predictor Score,” “The Charitable Donor Score,” and “The Medication Adherence Score.”[22] Scoring can also apply to an employee’s likelihood to join a union: it was reported in 2020 that Amazon was scoring entire Whole Foods stores to determine which of their stores had employees most likely to unionize.[23]

The Fair Credit Reporting Act (FCRA) gives consumers the right to view and challenge the data used to decide their credit scores. It also limits the use of consumer reports to defined permissible purposes. But outside of credit scores, all the other scoring that goes into automated decisions that impact our daily lives (e.g., getting a job, getting healthcare, etc.) is not regulated. One advocacy group wrote to the FTC in 2019: “The existence of the scores is secret. The kinds of data being fed into the algorithms is secret. The source of that data is secret. The algorithm is secret. The score is a secret.”[24] Scoring is another example of how the entire data broker industry is built on a lack of transparency.

More Blog Posts on Data Brokers

This is one of many blog posts I have done on data brokers. Check out the others: I have discussed the different types of data brokers, the sources from which data brokers collect their data, the risks associated with data brokers (as narrated by John Oliver), and proposed laws at the federal and state levels (e.g., California and Texas).


FOOTNOTES

[1] NATO StratCom COE, “Data Brokers and Security,” 2021, https://stratcomcoe.org/cuploads/pfiles/data_brokers_and_security_20-01-2020.pdf and Federal Trade Commission, “Data Brokers: A Call for Transparency and Accountability,” 2014, https://www.ftc.gov/system/files/documents/reports/data-brokers-call-transparency-accountability-report-federal-trade-commission-may-2014/140527databrokerreport.pdf.

[2] Acxiom, Global Data Navigator datasheet, https://marketing.acxiom.com/rs/982-LRE-196/images/Fact_Sheet_Global_Data_Navigator.pdf, and Federal Trade Commission, “Data Brokers: A Call for Transparency and Accountability,” 2014.

[3] New York Times, “Mapping, and Sharing, the Consumer Genome,” June 16, 2012, https://www.nytimes.com/2012/06/17/technology/acxiom-the-quiet-giant-of-consumer-database-marketing.html.

[4] Bennett Cyphers and Gennie Gebhart, “Behind the One Way Mirror: A Deep Dive into the Technology of Corporate Surveillance,” Electronic Frontier Foundation (EFF), December 2, 2019, https://www.eff.org/wp/behind-the-one-way-mirror.

[5] Federal Trade Commission, “Data Brokers: A Call for Transparency and Accountability,” 2014.

[6] Wolfie Christl, “Corporate Surveillance in Everyday Life,” Cracked Labs, June 2017, https://crackedlabs.org/en/corporate-surveillance.

[7] Aaron Rieke et al., “Data Brokers in an Open Society,” Open Society Foundations, November 2016, https://www.opensocietyfoundations.org/publications/data-brokers-open-society.

[8] Experian, “Life Event Marketing,” https://www.experian.com/marketing-services/life-event-marketing.

[9] NATO StratCom COE, “Data Brokers and Security,” 2021.

[10] Epsilon, “Niches 5.0 Brochure,” https://www.epsilon.com/us/insights/resources/niches-5.0-brochure.

[11] Bennett Cyphers and Gennie Gebhart, “Behind the One Way Mirror: A Deep Dive into the Technology of Corporate Surveillance,” Electronic Frontier Foundation (EFF), December 2, 2019.

[12] Shoshana Wodinsky, “Experian Is Tracking the People Most Likely to Get Screwed Over by Coronavirus,” Gizmodo, April 15, 2020, https://gizmodo.com/experian-is-tracking-the-people-most-likely-to-get-scre-1842843363.

[13] Federal Trade Commission, “Data Brokers: A Call for Transparency and Accountability,” 2014, and Justin Sherman, “Data Brokers are a Threat to Democracy,” Wired, April 13, 2021, https://www.wired.com/story/opinion-data-brokers-are-a-threat-to-democracy/.

[14] Federal Trade Commission, “Data Brokers: A Call for Transparency and Accountability,” 2014.

[15] Atlas Privacy, “Does Starbucks Know If I Wet the Bed,” February 9, 2022, https://atlasprivacy.medium.com/does-starbucks-know-if-i-wet-the-bed-37a7d9a9487f.

[16] US Senate Committee on Commerce, Science, and Transportation, “A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes, Staff Report for Chairman Rockefeller,” December 18, 2013, https://www.commerce.senate.gov/services/files/0d2b3642-6221-4888-a631-08f2f255b577.

[17] US Senate Committee on Commerce, Science, and Transportation, “A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes, Staff Report for Chairman Rockefeller,” December 18, 2013.

[18] Consumer Financial Protection Bureau (CFPB), “What is a FICO Score?” September 4, 2020, https://www.consumerfinance.gov/ask-cfpb/what-is-a-fico-score-en-1883/.

[19] Wolfie Christl, “Corporate Surveillance in Everyday Life,” Cracked Labs, June 2017.

[20] Harvey Rosenfield and Laura Antonini, “Opinion: Data isn’t just being collected from your phone. It’s being used to score you,” Washington Post, July 31, 2020, https://www.washingtonpost.com/opinions/2020/07/31/data-isnt-just-being-collected-your-phone-its-being-used-score-you/.

[21] Paul Boutin, “The Secretive World of Selling Data About You,” Newsweek, May 30, 2016, https://www.newsweek.com/secretive-world-selling-data-about-you-464789.

[22] Pam Dixon and Robert Gellman, “The Scoring of America: How Secret Consumer Scores Threaten Your Privacy and Future,” World Privacy Forum, April 2, 2014, http://www.worldprivacyforum.org/wp-content/uploads/2014/04/WPF_Scoring_of_America_April2014_fs.pdf.

[23] Hayley Peterson, “Amazon-owned Whole Foods is quietly tracking its employees with a heat map tool that ranks which stores are at most risk of unionizing,” Business Insider, April 20, 2020, https://www.businessinsider.com/whole-foods-tracks-unionization-risk-with-heat-map-2020-1.

[24] Represent Consumers, “June 24, 2019 Petition to the FTC, RE: Secret Surveillance Scoring: Urgent Request for Investigation and Enforcement Action,” https://www.representconsumers.org/wp-content/uploads/2019/06/2019.06.24-FTC-Letter-Surveillance-Scores.pdf.
