upvote
Maybe I'm too old school, but both GitHub and Codeberg for me are asynchronous "I want to send/share the code somehow", not "my active workspace I require to do work". But reading

> the worst thing ever for me as a developer is having the urge to code and not being able to access my remote.

Makes it seem like GitHub/Codeberg has to be online for you to be able to code, is that really the case? If so, how does that happen, you only edit code directly in the GitHub web UI or how does one end up in that situation?

reply
For me it's a soft block rather than a hard block. I use multiple computers, so when I switch to the other one I usually do a git pull, and after every commit I do a push. If that gets interrupted, then I have to resort to things like rsyncing over from the other system, but more than once I've lost work that way. I'm strongly considering just standing up a VM and using "just git" and foregoing any UI, but I make use of other features like CI/CD and Releases for distribution, so the VM strategy is still just a bandaid. When the remote is unavailable, it can be very disruptive.
reply
> If that gets interrupted, then I have resort to things like rsyncing over from the other system

I'm guessing you have SSH access between the two? You could just add it as another remote, via SSH, so you can push/pull directly between the two. This is what I do on my home network to sync configs and other things between various machines and OSes, just do `git remote add other-host git+ssh://user@10.55/~/the-repo-path` or whatever, and you can use it as any remote :)

Bonus tip: you can use local paths as git remote URLs too!

> but more than once I've lost work that way.

Huh, how? If you didn't push it earlier, you could just push it later? Same goes for pull. I don't understand how you could lose anything tracked in git; was it corruption, or what happened?

reply
Usually one of two things, mostly the latter: I forget to exclude the .git/ directory from the sync, or I have in-progress, nowhere-near-ready-for-commit changes on both hosts and I forget and sync before I check. These are all PEBKAC and/or workflow problems, but on a typical day I'll be working in or around a half-dozen repos and it's too easy to forget. The normal git workflow protects against that because uncommitted changes in one can just be rebased easily the next time I'm working in that repo on any given computer. I've been doing it like this for nearly 20 years and it's never been an issue because remotes were always quite stable/reliable. I really just need to change my workflow for the new reality, but old habits die hard.
reply
> just standing up a VM and using "just git"

That's what I do. Control your entire world yourself.

reply
If you can rsync from the other system, and likely have an SSH connection between them, why don't you just add it as an additional remote and git pull from it directly?
reply
I probably could. How does that work with uncommitted changes on the host? Would that be a problem?
reply
You cannot git push something that is not committed. The solution is to commit often (and do it over ssh if you forget on a remote system). It doesn't need to be a presentable commit; that can be cleaned up later. I use `git commit -amwip` all the time.
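
A minimal sketch of that "commit early, tidy later" flow (the branch state here is an assumption; `-amwip` is just `-a -m wip` squeezed together):

```shell
# Snapshot all tracked changes with a throwaway message.
git commit -am wip
# ...later, on whichever machine, unwrap the wip commit to keep working:
git reset --soft HEAD~1   # drops the commit, keeps the changes staged
```

The `reset --soft` leaves you exactly where you were before the wip commit, so nothing about the habit forces ugly history onto anyone else.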

Sure, you might neglect to add a file to your commit, or commit at all, but that's a problem whether you're pushing to a central public git forge or not.

reply
You'd create a bare git repo (just the contents of .git) on the host with git init --bare, separate from your usual working tree, and set it as a remote for your working trees, to which you can push and pull using ssh or even a path from the same machine.
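
Roughly like this, as a sketch with placeholder paths (the local-path variant; swap the path for user@host:path/to/repo.git to do the same over SSH, and the branch name is assumed to be main):

```shell
git init --bare ~/repos/myproject.git        # bare repo: just the .git contents
cd ~/src/myproject                           # your existing working tree
git remote add mirror ~/repos/myproject.git  # a plain path works as a remote URL
git push mirror main
```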
reply
If you have ssh access to the remote machine to set up a git remote, you can login to the remote machine and commit the changes that you forgot to commit.
reply
Roughly:

`ssh remote "cd $src/repo ; git diff" | git apply`

(You'll need to season to taste: what to do with staged changes, how to make sure both trees are in the same HEAD, etc)

reply
For some projects, the issue tracker is a pretty integral part of the documentation. Sure, you can host your own issue tracker somewhere, but that's still shifting a center point somewhere, in a theoretically decentralized system. I've frequently wished the issue tracker was part of the repository. Also -- love them or hate them -- LLMs would probably love that too.
reply
My main exposure to Codeberg is Zig, which has its issue tracker there, and I pull in changes from it.

For how infrequently I interface with Codeberg, I have to say that my experience has been pretty terrible when it comes to availability.

So I guess the answer is: the availability is bad enough that even infrequent interactions with it are a problem.

reply
> Makes it seem like GitHub/Codeberg has to be online for you to be able to code, is that really the case?

I can understand that for work with other active contributors, but I agree with you that it is a daft state of affairs for a solo or mostly-solo project.

Though if you have your repo online even away from the big places, it will get hit by the scrapers and you will end up with admin to do because of that, even if it doesn't block your normal workflow because your main remote is not public.

reply
You’re right, this is the proper way to use git. And I encourage developers to use their own cloud storage (or a remote volume) as their primary remote.

Even with the best habits, there will be a few times a month when you forget to push everything up and you’re blocked from working.

Codeberg needs to meet the highest availability levels for it to be viable.

reply
I was shaking my head in disbelief when reading that part too. I mean, git's whole raison d'etre, back when it was introduced, was that you do not need online access to the repo server most of the time.
reply
It's getting even worse if you read the thread about Claude going down the other day. People were having mini panic attacks.
reply
> I mean, git's whole raison d'etre, back when it was introduced, was that you do not need online access to the repo server most of the time.

So what? That's not how most people prefer to use it.

reply
> git's whole raison d'etre […] was that you do not need online access to the repo server most of the time

Not really. The point of git was to make Linus' job of collating, reviewing, and merging work from a disparate team of teams much less arduous. It just happens that many of the patterns needed for that also mean temporarily disconnected repositories work well.

reply
The whole point of git was to be a replacement for BitKeeper, after the Linux developers got banned from it for "hacking" when Andrew Tridgell connected to the server over telnet and typed "HELP".
reply
That too, though the point of using a distributed code control system was the purpose I mentioned. But even before BitKeeper getting in a tizzy about Tridgell's¹ shenanigans there was talk of replacing it because some properties of it were not ideal for something as large as the kernel with as many active contributors, and there were concerns about using a proprietary product to manage the Linux codebase. Linus was already tinkering with what would become the git we know.

--------

[1] He did a lot more than type “help” - he was essentially trying to reverse engineer the product to produce a compatible but more open client that gave access to metadata BitKeeper wanted you to pay to be able to access² which was a problem for many contributors.

[2] You didn't get the full version history on the free variants; this was one of the significant concerns making people discuss alternatives, and in some high-profile cases just plain refuse to touch BitKeeper at all

reply
I've had the same experience.

Philosophically I think it's terrible that Cloudflare has become a middleman in a huge and important swath of the internet. As a user, it largely makes my life much worse. It limits my browser, my ability to protect myself via VPNs, etc., and I am just browsing normally, not attacking anything. Pragmatically though, as a webmaster/admin/whatever you want to call it nowadays, Cloudflare is basically a necessity. I've started putting things behind it because if I don't, 99%+ of my traffic is bots, often bots clearly scanning for vulnerabilities (I run almost zero PHP sites, yet my traffic logs are often filled with requests like /admin.php and /wp-admin.php and all the WordPress things), plus constant crawls from clearly-not-search-engines that download everything and use robots.txt as a guide of what to crawl rather than what not to crawl. I haven't been DDoSed yet, but I've had images and PDFs and things downloaded so many times by these things that it costs me money. For some things where I or my family are the only legitimate users, I can just firewall-cmd all IPs except my own, but even then it's maintenance work I don't want to have to do.

I've tried many of the alternatives, and they often fail even on legitimate usecases. I've been blocked more by the alternatives than I have by Cloudflare, especially that one that does a proof of work. It works about 80% of the time, but that 20% is really, really annoying, to the point that when I see that screen pop up I just browse away.

It's really a disheartening state we find ourselves in. I don't think my principles/values have been tested more in the real world than the last few years.

reply
Either I am very lucky or what I am doing has zero value to bots, because I've been running servers online for at least 15 years, and never had any issue that couldn't be solved with basic security hygiene. I use cloudflare as my DNS for some servers, but I always disable any of their paid features. To me they could go out of business tomorrow and my servers would be chugging along just fine.
reply
Sometimes it's not security; it can be just bandwidth or CPU.

I have a website small enough not to attract too many bots, but sometimes something very innocent can bring it down.

For example, I put up a PHP iCal viewer, and some crawler started loading the calendar page over and over, taking up all the CPU cycles.

reply
Even the most minimal protection would stop that.
reply
deleted
reply
> and use robots.txt as a guide of what to crawl rather than what not to crawl

Mental note, make sure my robots.txt files contain a few references to slowly returning pages full of almost nonsense that link back to each other endlessly…

Not complete nonsense, that would be reasonably easy to detect and ignore. Perhaps repeats of your other content with every 5th word swapped with a random one from elsewhere in the content, every 4th word randomly misspelt, every seventh word reversed, every seventh sentence reversed, add a random sprinkling of famous names (Sir John Major, Arc de Triomphe, Sarah Jane Smith, Viltvodle VI) that make little sense in context, etc. Not enough change that automatic crap detection sees it as an obvious trap, but more than enough that ingesting data from your site into any model has enough detrimental effect to token weightings to at least undo any beneficial effect it might have had otherwise.

And when setting traps like this, make sure the response is slow enough that it won't use much bandwidth, and the serving process is very lightweight, and just in case that isn't enough make sure it aborts and errors out if any load metric goes above a given level.
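
A sketch of just one of those rules (reverse every 7th word) as a shell filter, assuming a POSIX awk on PATH; the swaps and misspellings would bolt on the same way:

```shell
mangle() {
  awk '{
    for (i = 1; i <= NF; i++) {
      w++
      if (w % 7 == 0) {            # every 7th word across the whole input
        r = ""
        for (j = length($i); j > 0; j--)
          r = r substr($i, j, 1)   # reverse the word character by character
        $i = r
      }
    }
    print
  }'
}
```

Piping a page's text through something like this before serving it to a flagged crawler costs almost nothing, which fits the "very lightweight" requirement.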

reply
So, basically iocaine (https://iocaine.madhouse-project.org/). It has indeed been very useful to get the AI scraper load on a server I maintain down to a reasonable level, even with its not so strict default configuration.
reply
https://blog.cloudflare.com/ai-labyrinth/

A bit like this? (iocaine is newer)

reply
First time seeing that, but yes, seems similar in concept. Iocaine can be self-hosted and put in as a "middleware" in your reverse proxy with a few lines of config, cloudflare's seems tied to their services. Cloudflares also generates garbage with generative models, while iocaine uses much simpler (and surely more "crude") methods of generating its garbage. Using LLMs to feed junk to LLMs just makes me cry, so much wasted compute.

Is iocaine actually newer though? Its first commit dates to 2025-01, while the blog post is from 2025-03. I couldn't find info on when Cloudflare started theirs. There's also Nepenthes, which had its first release in 2025-01 too.

reply
If I think about it, I find it awful. The fact that we need to put junk in our own stuff just for crawlers does not sit well with me.
reply
deleted
reply
Hot damn, this is a great idea! Reminds me fondly of an old project a friend and I built that looks like an SSH prompt or optionally an unauthed telnet listener, which looks and feels enough like a real shell that we would capture some pretty fascinating sessions of people trying to explore our system or load us with malware. Eventually somebody figured it out and then DDoSed the hell out of our stuff and would not stop hassling us. It was a good reminder that yanking people's chains sometimes really pisses them off and can attract attention and grudges that you really don't want. My friend ended up retiring his domain because he got tired of dealing with the special attention. It did allow us to capture some pretty fascinating data though that actually improved our security while it lasted.
reply
This is one reason why most crawlers ignore robots.txt now. The other reason is that bandwidth/bots are cheap enough now that they don't need web admins to help them optimize their crawlers.
reply
While I sympathise, I disagree with your stance. Cloudflare handles a large % of the Internet now because people put sites behind it that, as you admitted, don't need to be there.
reply
[dead]
reply
deleted
reply
OP is about Github. Have you seen the Github uptime monitor? It’s at 90% [1] for the last 90 days. I use both Codeberg and Github a lot and Github has, by far, more problems than Codeberg. Sometimes I notice slowdowns on Codeberg, but that’s it.

[1] https://mrshu.github.io/github-statuses/

reply
I stopped using GitHub a long time ago. I don't understand why GitLab isn't the default alternative.
reply
To be fair, GitHub has several orders of magnitude more users running on it than Codeberg. I'm also a Codeberg user, but I don't think anyone has seen a Forgejo/Gitea instance working at the scale of GitHub yet.
reply
I don't think OP was making a value judgment or anything. It's just weird to say you won't consider Codeberg because you need reliability when Codeberg's uptime is at 100% and Github's is at 90%.
reply
To be fair, GitHub has several orders of magnitude more revenue to support that, including from companies like mine who pay them good money and get absolutely sub-par service and reliability in return. I'd be happy for Codeberg to take my money for a better service on the core feature set (git hosting, PRs, issues). I can take my CI/CD elsewhere; we self-host runners anyway.
reply
I think the idea is that a Forgejo/Gitea instance should never have to work at anywhere near the scale of GitHub. Codeberg provides its Forgejo host as a convenience/community thing but it's not being built to be a central service.
reply
My own git server has been hit severely by scrapers. They're scraping everything. Commits, comparisons between commits, api calls for files, everything.

And it's pretty much all of them: ByteDance, OpenAI, AWS, Claude, and various ones I couldn't recognize. I basically just had to block all of them to get reasonable performance for a server running on a mini-pc.

I was going to move to Codeberg at some point, but they had downtime when I was considering it; I'd rather deal with that myself, then.

reply
Anyone actually scraping git repos would probably just do a 'git clone'. Crawling git hosts is extremely expensive, as git servers have always been inadvertent crawler traps.

They generate a URL for every version of every file on every commit and every branch and tag, and if that wasn't enough, n(n+1)/2 git diffs for every file on every commit it has existed on. Even a relatively small git repo with a few hundred files and commits explodes into millions of URLs in the crawl frontier. Server side, many of these are very expensive to generate as well, so it's really not a fantastic interaction between crawler and git host.
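
A back-of-envelope check of that blow-up, using the n(n+1)/2 figure (the repo size is an arbitrary example):

```shell
# Pairwise diff pages a crawler can discover among n commits.
diff_pages() { n=$1; echo $(( n * (n + 1) / 2 )); }
diff_pages 300   # a modest repo with 300 commits: 45150 diff URLs
```

Multiply that by the number of files touched and it's easy to see how a crawler gets stuck for days.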

If you run a web crawler, you need to add git host detection to actively avoid walking into them.

reply
And yet, it's exactly what all the AI companies are doing. However much it costs them in server costs and goodwill seems to be worth less to them than the engineering time to special-case the major git web UIs.
reply
I doubt they're actually interested in the git repos.

From the shape of the traffic it just looks like a poorly implemented web crawler. By default, a crawler that does not take measures to actively avoid git hosts will get stuck there and spend days trying to exhaust the links of even a single repo.

reply
How probable is your "probably"?
reply
Well, one is 60 repos per hour, and the other is 60 hours per repo.
reply
The whole point of git is to be decentralized so there is no reason for you to not have your current version available even when a remote is offline.
reply
How do people even on hacker news of all places conflate git with a code hosting platform all the time? Codeberg, GitHub or whatever are for tracking issues, running CI, hosting builds, and much more.

The idea that you shouldn't need a code hosting platform because git is decentralized is so out of place that it is genuinely puzzling how often it pops up.

reply
How do people on hacker news keep having reading issues?

The parent post mentioned: "the worst thing ever for me as a developer is having the urge to code and not being able to access my remote."

Emphasis on "code", not triaging issues, merging other people's branches, etc.

Besides, there are tools to sync Forgejo repositories, including PRs and issues.

reply
OP didn't conflate them.

They said they want to be able to rely on their git remote.

The people responding are saying "nah, an unreliable remote is fine because you can use other remotes" which doesn't address their problem. If Codeberg is unreliable, then why use it at all? Especially for CI, issues, and collab?

reply
The person you’re replying to is saying that you can do everything outside of tracking issues, running CI, ... without a remote. Like all Git operations that are not about collaboration. (but there is always email)

Maybe a hard blocker if you are pair programming or collaborating every minute. Not really if you just have one hour to program solo.

reply
It's also trivial to have multiple remotes; I do in most of my repos. When one has issues I just push to the other instead of to both.
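
One way to sketch that (remote names and URLs here are placeholders): give a single remote two push URLs, so a plain `git push` updates both hosts, and you only fall back to pushing one explicitly when the other is down.

```shell
git remote add origin git@github.com:me/repo.git
# First --add must repeat the original URL, since setting any push URL
# stops git from reusing the fetch URL for pushes:
git remote set-url --add --push origin git@github.com:me/repo.git
git remote set-url --add --push origin git@codeberg.org:me/repo.git
git remote get-url --push --all origin   # lists both push URLs
```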
reply
> But they still have downtime

Thank God GitHub is... oh.

https://mrshu.github.io/github-statuses/

reply
Probably has happened at some point, but personally I have not been hit with/experienced downtime of Codeberg yet. The other day, however, GitHub was down again. I have not used GitLab for a while, and when I used it, it worked fine, and its CI seems saner than GitHub's to me, but GitLab is not the snappiest user experience either.

Well, Codeberg doesn't have all the features I did use of Gitlab, but for my own projects I don't really need them either.

reply
> for me as a developer is having the urge to code and not being able to access my remote

I think that's the moment when you choose to self-host whatever git wrapper you prefer. It really isn't that complicated to do, and it even allows for some fun (as in cheap and productive) setups where your forge is on your local network, or really close to your region, and you (maybe) only mirror or back up to a bigger system like Codeberg/GitHub.

In our case, we also use that as an opportunity to mirror OCI/package repositories for dependencies we use in our apps and during development, so not only are builds faster, but we also don't abuse free web endpoints with our CI/CD requests.

reply
I agree. I switched to Codeberg but switched back after a few months. Funny enough, I found there to be more unreported downtime on Codeberg than GitHub.
reply
deleted
reply
> Lazy has nothing to do with it, codeberg simply doesn't work.

I've been working on it for months now; it does work, lol.

reply
I find it ironic that Git was made to get rid of central repos, and then we re-introduced them.
reply
That is what we have been doing for quite some time now, from what I gather. Every time I see something becoming popular, I am like "Hmm, I've seen this before", and I really have. They just gave it a fancier name with a fancier logo and did some marketing, and there you go, old is new.
reply
[flagged]
reply
[flagged]
reply
Thanks for your input
reply