Here's something that might ruin your morning coffee: right now, there are companies making money by selling your personal information, scraped directly from your GitHub profile, your npm packages, and your Stack Overflow answers. And in most jurisdictions, they're not breaking any laws doing it.
Data brokers are in the business of aggregation. They crawl public sources, combine fragments of information into comprehensive profiles, and sell those profiles to recruiters, marketers, background check companies, and sometimes much shadier buyers. For developers, the problem is acute because our professional lives are inherently public. You can't be a successful open-source contributor while being invisible.
Let me walk you through exactly how this works, what data they're collecting, and what you can actually do about it.
How Data Brokers Find Developer Profiles
Data brokers don't hack anything. They don't need to. They simply harvest what's publicly available, at scale, using automated scrapers. Here's where they look:
GitHub: The Gold Mine
Your GitHub profile is a treasure trove for data aggregators:
- Profile information: Name, email, bio, location, company, website URL
- Commit history: Email addresses in git commits (even old ones you forgot about)
- Contribution graph: Activity patterns that reveal your schedule and timezone
- Repository metadata: Technologies you use, projects you're involved in
- Organization memberships: Your employer, past and present
- Social connections: Who you follow, who follows you, collaborators
The big one people miss: git commit emails. Even if you've set your GitHub profile email to private, every commit you've ever made records an author email in the repository history, and anyone who clones the repository can read it from the git log. Unless you've used GitHub's no-reply email address from day one, your real email is embedded in that history forever.
npm and PyPI: Package Registry Exposure
Publishing a package means publishing your identity:
- package.json contains author name and email by default
- PyPI setup.py / pyproject.toml includes author and author_email fields
- Registry profiles show all packages you maintain
- Download statistics indicate your influence and reach
- Changelog/commit links tie back to your GitHub identity
Even if you've since removed your email from package.json, older versions on npm still contain it. Package registries maintain historical versions indefinitely.
Stack Overflow: Professional Profile Building
Stack Overflow profiles provide brokers with:
- Your real name (most developers use their real names for reputation building)
- Technical expertise (inferred from tags you answer in)
- Experience level (inferred from reputation score and answer quality)
- Location and timezone (from profile settings)
- Links to GitHub, personal sites, and other profiles
- Employment history (many devs list current employer)
Other Developer-Specific Sources
- Conference talk listings: Speaker bios, employer, photo, social links
- Domain WHOIS records: Home address if you didn't use privacy protection
- LinkedIn: The most-scraped professional platform, period
- Blog posts: Real name, opinions, employer mentions
- Open-source contributor lists: CONTRIBUTORS.md, AUTHORS files
- Mailing list archives: Email addresses in public list archives
What Data Brokers Actually Collect
Let me paint a picture of what a typical "developer profile" looks like in a broker's database:
Name: John Developer
Emails: john@personal.com, john.dev@company.com, jdev@oldstartup.io
Phone: (555) 123-4567
Address: 123 Code Street, San Francisco, CA 94102
Employer: TechCorp (current), StartupX (2022-2024), BigCo (2019-2022)
Skills: TypeScript, React, Node.js, AWS, Python, Kubernetes
GitHub: github.com/johndev (2,400 followers, 89 repos)
Influence Score: High (top 5% on Stack Overflow)
Salary Estimate: $180,000-$220,000
Education: CS degree, State University
Age: 32
Related people: [family members]
Property records: [home ownership data]
That profile was built entirely from public data sources combined with property records and phone databases. No hacking required.
Who Buys Developer Data and What They Pay
The buyers of developer data include:
- Recruiting firms: Pay $0.10-$2.00 per profile for bulk developer data. They use it for cold outreach campaigns.
- Marketing companies: Target developers with tool and service ads. Your Stack Overflow activity tells them exactly which tools you use.
- Background check companies: Compile reports for employers. Your entire git history becomes part of pre-employment screening.
- Scammers and social engineers: Use detailed profiles to craft targeted phishing attacks. Knowing your tech stack makes phishing emails much more convincing.
- Data enrichment companies: Buy from one broker, add their own data, sell to another broker at a markup. Your data gets recycled through dozens of companies.
Developer profiles command premium prices because they represent high-income individuals with specific, targetable interests. A generic consumer profile might sell for pennies; a senior developer profile with verified contact info can go for $5-$20 each in bulk.
How to Check If You're Exposed
Before panicking, let's assess the damage. Here's how to check what's already out there:
Step 1: Google Yourself
Start simple. Search for:
- "Your Name" + developer
- "Your Name" + GitHub
- Your email addresses (in quotes)
- Your phone number
Look beyond page 1. Broker sites often rank on pages 2-5.
Step 2: Check Common Broker Sites
Visit these sites and search for yourself:
- Spokeo
- BeenVerified
- Whitepages
- PeopleFinders
- Intelius
- ZoomInfo (professional/developer focused)
- Clearbit (tech company focused)
Step 3: Check Your Git History
Run this in any public repository you've contributed to:
git log --format='%ae' | sort -u
Every unique email there is potentially in a broker's database.
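That list usually mixes real addresses with GitHub no-reply ones. As a rough sketch, the hypothetical helper below (`exposed_emails` is a name made up for this example) drops the no-reply entries so only the addresses that actually identify you remain:

```shell
# exposed_emails: read one email per line on stdin and print the unique
# addresses that are NOT GitHub no-reply addresses.
exposed_emails() {
  grep -v -i \
    -e '@users\.noreply\.github\.com$' \
    -e '^noreply@github\.com$' \
  | sort -u
}

# Usage inside a repository. %ae is the author email, %ce the committer
# email, and --all walks every branch rather than just the current one:
#   git log --all --format='%ae%n%ce' | exposed_emails
```

Checking committer emails as well as author emails matters because the two can differ, for example after a rebase or a cherry-pick by someone else.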
Step 4: Check npm/PyPI
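Look at your published packages, including the older versions. The command below shows the latest release; append @<version> to the package name to inspect an older one: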
npm info your-package-name | grep -i "email\|author"
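Grepping for "email" can miss addresses that sit inside other fields. A slightly more thorough sketch is to pull anything email-shaped out of the JSON metadata. `extract_emails` is a hypothetical helper, the pattern is a loose heuristic rather than a full address parser, and `1.0.0` is a placeholder version:

```shell
# extract_emails: print every unique email-looking string found on stdin.
# The regular expression is a heuristic and will miss unusual addresses.
extract_emails() {
  grep -E -o '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' | sort -u
}

# Usage (needs network access and a package you have published):
#   npm view your-package-name versions --json        # list every version
#   npm view your-package-name@1.0.0 --json | extract_emails
```

Running the second command against a few old versions shows which addresses are still being served by the registry.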
Step 5: WHOIS Lookup
If you own domains, check if WHOIS privacy is active:
whois yourdomain.com | grep -i "registrant"
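WHOIS output varies a lot between registries, so reading the registrant lines by eye is the reliable check. If you own many domains, a rough first pass could look like the sketch below. `whois_privacy_status` is a hypothetical helper, and its keyword list is an assumption about wording that privacy services commonly use.

```shell
# whois_privacy_status: read WHOIS output on stdin and make a rough guess
# about whether the registrant fields are hidden. Heuristic only.
whois_privacy_status() {
  # Keep only registrant lines, then look for typical privacy wording.
  if grep -i '^[[:space:]]*registrant' \
     | grep -i -q -E 'redacted|privacy|proxy|protected|withheld'; then
    echo "likely private"
  else
    echo "check manually"
  fi
}

# Usage:
#   whois yourdomain.com | whois_privacy_status
```

Treat "likely private" as a hint, not proof, and look at the raw output for any domain that reports "check manually".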
How to Remove Your Data
You have two paths: manual and automated.
The Manual Path (Not Recommended)
Each data broker has an opt-out process. Some are straightforward web forms. Others require:
- Mailed physical letters
- Faxed documents
- Notarized identity verification
- Phone calls during business hours
- Repeated submissions when they "lose" your request
There are hundreds of brokers. Even if each opt-out takes just 10 minutes, you're looking at weeks of spare-time work. And brokers re-acquire data constantly, so you'd need to repeat this every few months.
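For a sense of scale, here is the back-of-the-envelope arithmetic. The figures are assumptions picked for illustration: 300 brokers stands in for "hundreds", and four rounds a year stands in for "every few months".

```shell
# optout_hours BROKERS MINUTES_EACH ROUNDS_PER_YEAR
# Prints the total hours per year, rounded down to a whole hour.
optout_hours() {
  echo $(( $1 * $2 * $3 / 60 ))
}

optout_hours 300 10 1   # a single pass over 300 brokers: 50 hours
optout_hours 300 10 4   # the same pass repeated quarterly: 200 hours
```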
The Automated Path (Recommended)
This is where data removal services earn their keep. Incogni handles the entire process automatically: contacting hundreds of brokers, submitting opt-out requests, following up on ignored requests, and fighting rejected claims. It covers the US, UK, EU, Switzerland, and Canada, which matters because developer data crosses borders constantly.
At ~$6.49/month on the annual plan, it costs less than an hour of your time, which would cover only a handful of manual opt-outs. And it runs continuously, catching re-listings and new brokers.
Preventing Future Exposure
Removing existing data is half the battle. Here's how to minimize future exposure:
GitHub Settings
- Use GitHub's no-reply email for commits: username@users.noreply.github.com
- Set email to private in GitHub settings
- Consider whether you need your real name, location, and employer in your bio
- Review organization memberships visibility
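As a minimal sketch of the first two items, the helpers below build a no-reply address and check whether a configured address is one. Both function names are hypothetical. GitHub also issues an ID-prefixed form of the address, so copy the exact value shown in your own email settings instead of guessing it.

```shell
# noreply_address USERNAME [ID]
# Builds a GitHub no-reply address, with the ID+ prefix when an ID is given.
noreply_address() {
  if [ -n "${2:-}" ]; then
    printf '%s+%s@users.noreply.github.com\n' "$2" "$1"
  else
    printf '%s@users.noreply.github.com\n' "$1"
  fi
}

# is_noreply EMAIL: succeeds only for GitHub no-reply addresses.
is_noreply() {
  case "$1" in
    *@users.noreply.github.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   git config --global user.email "$(noreply_address your-username)"
#   is_noreply "$(git config user.email)" || echo "real email configured" >&2
```

The second usage line could go in a pre-commit hook followed by a non-zero exit, which would stop a commit made with a real address before it ever reaches a public repository.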
Package Registries
- Use an email alias for package author fields (see our password manager guide for tools that offer email aliases)
- Consider using an org/team account for package publishing
- Review and update author fields in existing packages
General Practices
- Use a VPN to prevent IP-based location tracking
- Enable WHOIS privacy on all domains
- Use email aliases for conference registrations and mailing lists
- Store sensitive documents in encrypted cloud storage
- Audit your public profiles quarterly
For a comprehensive approach to locking down your online presence, check out our developer privacy checklist.
The Legal Landscape
Data brokers operate legally in most jurisdictions, but regulations are tightening:
- GDPR (EU): Gives people in the EU the right to request erasure of their personal data from organizations that hold it (Article 17), subject to some exceptions
- CCPA/CPRA (California): Similar deletion rights for California residents
- Various state laws: Colorado, Virginia, Connecticut, and others have passed privacy laws
Understanding privacy laws by region helps you know your rights. The challenge isn't legal; it's practical. You have the right to request deletion, but exercising that right by hand across hundreds of brokers is rarely feasible.
This is also relevant if you're building AI tools that handle user data. Our guides on AI code and data privacy and the GDPR guide for developers cover the compliance side.
The Real Risk: Social Engineering
The scariest use of broker data isn't spam emails; it's targeted social engineering. With a detailed developer profile, an attacker can:
- Craft phishing emails referencing your actual projects ("I found a bug in your-package-name...")
- Impersonate colleagues they know you work with
- Reference real conferences you spoke at
- Target you through family members listed in the same broker databases
- Bypass security questions using publicly available personal details
This ties directly into securing your API keys and following security checklists, because the weakest link is often not your code, but the human data available to attackers.
Frequently Asked Questions
Can I just make my GitHub profile private?
You can, but it defeats the purpose of having a GitHub profile in the professional sense. A better approach is to sanitize what's there: use no-reply email for commits, remove sensitive details from your bio, and let a data removal service handle what's already been scraped. The historical data in broker databases won't disappear just because you make your profile private today.
Do data brokers comply with GDPR deletion requests?
They're legally required to under GDPR if they hold data on EU residents. In practice, compliance varies wildly. Some process requests within days. Others ignore them, stall, or "lose" the request. This is exactly why services like Incogni that fight rejected claims are valuable: they don't let brokers off the hook.
I only publish under a pseudonym. Am I safe?
Safer, but not immune. If your pseudonym has ever been linked to your real identity anywhere (a conference registration, a payment processor, a domain registration), brokers can and do make that connection. They specialize in linking disparate identities into unified profiles.
How often do brokers re-acquire data after removal?
Frequently. Data broker databases are rebuilt from continuous scraping. A one-time removal might last 3-6 months before your data reappears. This is why continuous monitoring services are important: they catch and re-remove data as it resurfaces.
What about the data in old git commits? Can that be removed?
Unfortunately, rewriting git history on public repositories is impractical for most projects: it changes commit hashes and breaks everyone's clones, and existing clones and forks keep the old commits anyway. The best approach is: (1) use no-reply emails going forward, (2) accept that historical commit emails are public, and (3) use a data removal service to handle the downstream broker listings that result from that exposure.
Is developer data really worth more than regular consumer data?
Yes, significantly. Developer profiles indicate high income, specific technical interests (useful for targeted marketing), and professional networks. ZoomInfo, for example, charges enterprise clients substantial sums for access to verified tech professional profiles. Your data is literally more valuable because you code for a living.