Reading the private codebase of a unicorn startup worth $120 million on a fine Sunday afternoon, coffee in hand, would be an exciting experience, wouldn’t it? It is not as uncommon as it sounds. Only last year, Twitch joined the long line of organizations whose source code has been made public inadvertently. Twitch’s source code, comprising 6,000 internal Git repositories and 3,000,000 documents with a combined unzipped size of 200GB, was leaked on the 4chan forum.
Here’s how we got access to 159 private codebases of 13 organizations ranging from small-scale startups to leading unicorns.
It’s not unusual for developers to hardcode sensitive information in their source code and then push it to popular code-sharing platforms like GitHub. According to GitGuardian, 6 million hardcoded secrets were discovered in 2021, twice as many as in 2020, with India being the leading source of leaks. GitGuardian’s studies are limited to public repositories hosted on GitHub or GitLab and do not cover secrets committed to private repositories or self-hosted Git instances. Given that limitation, it is astounding that no directed research has been conducted to reveal where else these hardcoded tokens appear apart from version-control platforms. Hence, in our study, we explore the source code of millions of mobile apps and identify these leaks directly. In this talk, we investigate the causes and impacts of such leaks and the techniques that can be used to prevent them. We’ll also give you a sneak peek into some of our most interesting findings.
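To give a flavor of what hunting for hardcoded secrets in source code can look like, here is a minimal sketch in Python. It is not the tooling used in the study; the function name `find_secrets`, the rule names, and the sample string are hypothetical, and the two patterns only cover the commonly seen shapes of AWS access key IDs and GitHub personal access tokens.

```python
import re

# Illustrative rules only (an assumption for this sketch); real scanners
# rely on much larger rule sets plus entropy checks and validation.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def find_secrets(text):
    """Return (rule_name, line_number, match) for each suspected secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((name, lineno, match.group(0)))
    return findings


# Hypothetical snippet of decompiled app source containing a fake key.
sample = 'String region = "us-east-1";\nString key = "AKIAABCDEFGHIJKLMNOP";\n'

for rule, lineno, value in find_secrets(sample):
    print(f"line {lineno}: possible {rule}: {value}")
```

A pattern match is only a lead. A scan like this tends to produce false positives, so each hit would still need to be reviewed before it is treated as a real leak.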
Ashikka is a junior at VIT Vellore. She takes a keen interest in anything related to cybersecurity. She is currently working as a Security Research + Technical Writing Intern at CloudSEK, India, where she heads the content department for BeVigil, the world’s first security search engine. In her free time, she enjoys reading, hiking, and building cool projects.