1. How are the repos discovered and maintained?
- (Daily) I have a script that searches more than 100 keywords and topics to discover new repos. Repos are annotated by AI (gpt-5.2-mini). Its agreement with my annotations is ~75%, but most of the disagreements are on repos that I myself am unsure how to categorize.
- (Daily) I have another script that updates existing repos.
- (Once in a while) I have another script that uses AI to annotate locations for repos and developers.
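
As a rough sketch of what a keyword-and-topic discovery step could look like (this is not the actual script), the snippet below builds request URLs for GitHub's repository search endpoint. The keyword and topic lists, the `build_search_url` helper, and the choice to apply the star threshold at search time are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# GitHub's REST endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(term: str, min_stars: int = 500,
                     is_topic: bool = False, page: int = 1) -> str:
    """Build a search URL for one keyword or topic (hypothetical helper)."""
    # `topic:` and `stars:` are standard GitHub search qualifiers.
    qualifier = f"topic:{term}" if is_topic else term
    query = f"{qualifier} stars:>={min_stars}"
    params = {
        "q": query,
        "sort": "stars",
        "order": "desc",
        "per_page": 100,  # maximum page size for this endpoint
        "page": page,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


# Hypothetical lists; the real script uses more than 100 entries.
KEYWORDS = ["large language model", "vector database"]
TOPICS = ["llm", "rag"]

urls = [build_search_url(k) for k in KEYWORDS]
urls += [build_search_url(t, is_topic=True) for t in TOPICS]
```

Each URL would then be fetched and paginated by whatever HTTP client the script uses. Every search request counts against the API rate limit.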
2. Why do you only show repos with >= 500 stars?
GitHub rate limits. I'm using the free GitHub API, so I can only query a limited number of repos per hour. Lowering the minimum star count would push the number of repos I need to query past that limit.
If you want to sponsor the project so it can switch to a paid API, email me!
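
To make the trade-off concrete, here is a back-of-the-envelope sketch. It assumes one API request per repo per update and a budget of 5,000 requests per hour, which is the limit GitHub documents for requests authenticated with a personal access token. The repo counts are made-up numbers for illustration, not figures from this project.

```python
import math


def hours_to_refresh(num_repos: int, requests_per_hour: int,
                     requests_per_repo: int = 1) -> int:
    """Whole hours needed to update every repo once under a rate limit."""
    if requests_per_hour <= 0:
        raise ValueError("requests_per_hour must be positive")
    total_requests = num_repos * requests_per_repo
    return math.ceil(total_requests / requests_per_hour)


# Hypothetical repo counts at two star thresholds.
print(hours_to_refresh(20_000, 5_000))   # 4
print(hours_to_refresh(200_000, 5_000))  # 40
```

Under these assumptions, a tenfold increase in tracked repos turns a 4-hour refresh into a 40-hour one, which no longer fits in a daily update.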
3. Is star diff a good indicator of good repos?
The goal of the project is to help you discover trending repos. Evaluating a repo is up to you. What I've found is that if a repo is no longer maintained or not very useful, its growth tapers off quickly.
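
For readers who want to see the idea in code, here is a minimal sketch of a star diff. It assumes the diff is simply the change in star count between two snapshots. The `star_diffs` helper and the sample data are hypothetical.

```python
def star_diffs(previous: dict[str, int],
               current: dict[str, int]) -> list[tuple[str, int]]:
    """Return (repo, star change) pairs, largest gain first.

    Repos missing from the earlier snapshot are skipped so that a newly
    tracked repo does not show up as one huge jump.
    """
    diffs = {
        repo: stars - previous[repo]
        for repo, stars in current.items()
        if repo in previous
    }
    return sorted(diffs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical snapshots taken some time apart.
last_snapshot = {"org/a": 1200, "org/b": 5000, "org/c": 800}
this_snapshot = {"org/a": 1500, "org/b": 5010, "org/c": 790, "org/d": 600}

print(star_diffs(last_snapshot, this_snapshot))
# [('org/a', 300), ('org/b', 10), ('org/c', -10)]
```

In this example `org/b` has the most stars but barely moved, while the smaller `org/a` is the one trending.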
4. What insights did you get from looking at so many repos?
I like doing data analysis, and here is what I've learned from analyzing these repos in the past.
- 2024 open source AI landscape analysis
- 2021 open source AI landscape analysis
- 2020 open source AI landscape analysis
To stay updated on my future analyses, check out my Substack, X, or LinkedIn.
If you want to collaborate on market research, email me!