upvote
Maybe I'm too old school, but both GitHub and Codeberg for me are asynchronous "I want to send/share the code somehow", not "my active workspace I require to do work". But reading

> the worst thing ever for me as a developer is having the urge to code and not being able to access my remote.

Makes it seem like GitHub/Codeberg has to be online for you to be able to code, is that really the case? If so, how does that happen, you only edit code directly in the GitHub web UI or how does one end up in that situation?

reply
For me it's a soft block rather than a hard block. I use multiple computers, so when I switch to the other one I usually do a git pull, and after every commit I do a push. If that gets interrupted, then I have to resort to things like rsyncing over from the other system, but more than once I've lost work that way. I'm strongly considering just standing up a VM and using "just git" and foregoing any UI, but I make use of other features like CI/CD and Releases for distribution, so the VM strategy is still just a bandaid. When the remote is unavailable, it can be very disruptive.
reply
> If that gets interrupted, then I have resort to things like rsyncing over from the other system

I'm guessing you have SSH access between the two? You could just add it as another remote, via SSH, so you can push/pull directly between the two. This is what I do on my home network to sync configs and other things between various machines and OSes, just do `git remote add other-host git+ssh://user@10.55/~/the-repo-path` or whatever, and you can use it as any remote :)

Bonus tip: you can use local paths as git remote URLs too!

> but more than once I've lost work that way.

Huh, how? If you didn't push it earlier, you could just push it later? Same goes for pull. I don't understand how you could lose anything tracked in git; was it corruption, or what happened?

reply
Usually one of two things, mostly the latter: I forget to exclude the .git/ directory from the sync, or I have in-progress, nowhere-near-ready-for-commit changes on both hosts and I forget and sync before I check. These are all PEBKAC and/or workflow problems, but on a typical day I'll be working in or around a half-dozen repos and it's too easy to forget. The normal git workflow protects against that because uncommitted changes in one can just be rebased easily the next time I'm working in that repo on any given computer. I've been doing it like this for nearly 20 years and it's never been an issue because remotes were always quite stable/reliable. I really just need to change my workflow for the new reality, but old habits die hard.
reply
> just standing up a VM and using "just git"

That's what I do. Control your entire world yourself.

reply
If you can rsync from the other system, and likely have an SSH connection between them, why don't you just add it as an additional remote and git pull from it directly?
reply
I probably could. How does that work with uncommitted changes on the host? Would that be a problem?
reply
You cannot git push something that is not committed. The solution is to commit often (and do it over ssh if you forget on a remote system). It doesn't need to be a presentable commit; that can be cleaned up later. I use `git commit -amwip` all the time.
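
A minimal sketch of that "commit early, tidy later" flow (the branch state here is an assumption; `-amwip` is just `-a -m wip` squeezed together):

```shell
# Snapshot all tracked changes with a throwaway message.
git commit -am wip
# ...later, on whichever machine, unwrap the wip commit to keep working:
git reset --soft HEAD~1   # drops the commit, keeps the changes staged
```

The `reset --soft` leaves you exactly where you were before the wip commit, so nothing about the habit forces ugly history onto anyone else.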

Sure, you might neglect to add a file to your commit, or commit at all, but that's a problem whether you're pushing to a central public git forge or not.

reply
You'd create a bare git repo (just the contents of .git) on the host with git init --bare, separate from your usual working tree, and set it as a remote for your working trees, to which you can push and pull using ssh or even a path from the same machine.
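
Roughly like this, as a sketch with placeholder paths (the local-path variant; swap the path for user@host:path/to/repo.git to do the same over SSH, and the branch name is assumed to be main):

```shell
git init --bare ~/repos/myproject.git        # bare repo: just the .git contents
cd ~/src/myproject                           # your existing working tree
git remote add mirror ~/repos/myproject.git  # a plain path works as a remote URL
git push mirror main
```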
reply
If you have ssh access to the remote machine to set up a git remote, you can login to the remote machine and commit the changes that you forgot to commit.
reply
Roughly:

`ssh remote "cd $src/repo ; git diff" | git apply`

(You'll need to season to taste: what to do with staged changes, how to make sure both trees are in the same HEAD, etc)

reply
For some projects, the issue tracker is a pretty integral part of the documentation. Sure, you can host your own issue tracker somewhere, but that's still shifting a center point somewhere, in a theoretically decentralized system. I've frequently wished the issue tracker was part of the repository. Also -- love them or hate them -- LLMs would probably love that too.
reply
My main exposure to Codeberg is Zig, which has its issue tracker there, and I pull in changes from it.

For how infrequently I interface with Codeberg, I have to say that my experience has been pretty terrible when it comes to availability.

So I guess the answer is: the availability is bad enough that even infrequent interactions with it are a problem.

reply
> Makes it seem like GitHub/Codeberg has to be online for you to be able to code, is that really the case?

I can understand that for work with other active contributors, but I agree with you that it is a daft state of affairs for a solo or mostly-solo project.

Though if you have your repo online even away from the big places, it will get hit by the scrapers and you will end up with admin to do because of that, even if it doesn't block your normal workflow because your main remote is not public.

reply
You’re right, this is the proper way to use git. And I encourage developers to use their own cloud storage (or a remote volume) as their primary remote.

Even with the best habits, there will be a few times a month when you forget to push everything up and you’re blocked from working.

Codeberg needs to meet the highest availability levels for it to be viable.

reply
I was shaking my head in disbelief when reading that part too. I mean, git's whole raison d'etre, back when it was introduced, was that you do not need online access to the repo server most of the time.
reply
It's getting even worse if you read the thread about Claude going down the other day. People were having mini panic attacks.
reply
> I mean, git's whole raison d'etre, back when it was introduced, was that you do not need online access to the repo server most of the time.

So what? That's not how most people prefer to use it.

reply
> git's whole raison d'etre […] was that you do not need online access to the repo server most of the time

Not really. The point of git was to make Linus' job of collating, reviewing, and merging work from a disparate team of teams much less arduous. It just happens that many of the patterns needed for that also mean temporarily disconnected repositories work well.

reply
The whole point of git was to be a replacement for BitKeeper, after the Linux developers got banned from it for "hacking" when Andrew Tridgell connected to the server over telnet and typed "HELP".
reply
That too, though the point of using a distributed code control system was the purpose I mentioned. But even before BitKeeper getting in a tizzy about Tridgell's¹ shenanigans there was talk of replacing it because some properties of it were not ideal for something as large as the kernel with as many active contributors, and there were concerns about using a proprietary product to manage the Linux codebase. Linus was already tinkering with what would become the git we know.

--------

[1] He did a lot more than type “help” - he was essentially trying to reverse engineer the product to produce a compatible but more open client that gave access to metadata BitKeeper wanted you to pay to be able to access² which was a problem for many contributors.

[2] You didn't get the full version history on the free variants; this was one of the significant concerns making people discuss alternatives, and in some high-profile cases just plain refuse to touch BitKeeper at all

reply
I've had the same experience.

Philosophically I think it's terrible that Cloudflare has become a middleman in a huge and important swath of the internet. As a user, it largely makes my life much worse. It limits my browser, my ability to protect myself via VPNs, etc., and I am just browsing normally, not attacking anything. Pragmatically though, as a webmaster/admin/whatever you want to call it nowadays, Cloudflare is basically a necessity. I've started putting things behind it because if I don't, 99%+ of my traffic is bots, often bots clearly scanning for vulnerabilities (I run almost zero PHP sites, yet my traffic logs are often filled with requests like /admin.php and /wp-admin.php and all the WordPress things), plus constant crawls from clearly-not-search-engines that download everything and use robots.txt as a guide of what to crawl rather than what not to crawl. I haven't been DDoSed yet, but I've had images and PDFs and things downloaded so many times by these things that it costs me money. For some things where I or my family are the only legitimate users, I can just firewall-cmd all IPs except my own, but even then it's maintenance work I don't want to have to do.

I've tried many of the alternatives, and they often fail even on legitimate usecases. I've been blocked more by the alternatives than I have by Cloudflare, especially that one that does a proof of work. It works about 80% of the time, but that 20% is really, really annoying, to the point that when I see that screen pop up I just browse away.

It's really a disheartening state we find ourselves in. I don't think my principles/values have been tested more in the real world than the last few years.

reply
Either I am very lucky or what I am doing has zero value to bots, because I've been running servers online for at least 15 years, and never had any issue that couldn't be solved with basic security hygiene. I use cloudflare as my DNS for some servers, but I always disable any of their paid features. To me they could go out of business tomorrow and my servers would be chugging along just fine.
reply
Sometimes it's not security; it can be just bandwidth or CPU.

I have a website small enough not to attract too many bots, but sometimes something very innocent can bring it down.

For example, I put up a PHP iCal viewer, and some crawler started loading the calendar page over and over, taking up all the CPU cycles.

reply
Even the most minimal protection would stop that.
reply
deleted
reply
> and use robots.txt as a guide of what to crawl rather than what not to crawl

Mental note, make sure my robots.txt files contain a few references to slowly returning pages full of almost nonsense that link back to each other endlessly…

Not complete nonsense, that would be reasonably easy to detect and ignore. Perhaps repeats of your other content with every 5th word swapped with a random one from elsewhere in the content, every 4th word randomly misspelt, every seventh word reversed, every seventh sentence reversed, add a random sprinkling of famous names (Sir John Major, Arc de Triomphe, Sarah Jane Smith, Viltvodle VI) that make little sense in context, etc. Not enough change that automatic crap detection sees it as an obvious trap, but more than enough that ingesting data from your site into any model has enough detrimental effect to token weightings to at least undo any beneficial effect it might have had otherwise.

And when setting traps like this, make sure the response is slow enough that it won't use much bandwidth, and the serving process is very lightweight, and just in case that isn't enough make sure it aborts and errors out if any load metric goes above a given level.
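
A sketch of just one of those rules (reverse every 7th word) as a shell filter, assuming a POSIX awk on PATH; the swaps and misspellings would bolt on the same way:

```shell
mangle() {
  awk '{
    for (i = 1; i <= NF; i++) {
      w++
      if (w % 7 == 0) {            # every 7th word across the whole input
        r = ""
        for (j = length($i); j > 0; j--)
          r = r substr($i, j, 1)   # reverse the word character by character
        $i = r
      }
    }
    print
  }'
}
```

Piping a page's text through something like this before serving it to a flagged crawler costs almost nothing, which fits the "very lightweight" requirement.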

reply
So, basically iocaine (https://iocaine.madhouse-project.org/). It has indeed been very useful to get the AI scraper load on a server I maintain down to a reasonable level, even with its not so strict default configuration.
reply
https://blog.cloudflare.com/ai-labyrinth/

A bit like this? (iocaine is newer)

reply
First time seeing that, but yes, seems similar in concept. Iocaine can be self-hosted and put in as a "middleware" in your reverse proxy with a few lines of config, cloudflare's seems tied to their services. Cloudflares also generates garbage with generative models, while iocaine uses much simpler (and surely more "crude") methods of generating its garbage. Using LLMs to feed junk to LLMs just makes me cry, so much wasted compute.

Is iocaine actually newer though? Its first commit dates to 2025-01, while the blog post is from 2025-03. I couldn't find info on when Cloudflare started theirs. There's also Nepenthes, which had its first release in 2025-01 too.

reply
If I think about it, I find it awful. The fact that we need to put junk in our own stuff just for crawlers does not sit well with me.
reply
deleted
reply
Hot damn, this is a great idea! Reminds me fondly of an old project a friend and I built that looks like an SSH prompt or optionally an unauthed telnet listener, which looks and feels enough like a real shell that we would capture some pretty fascinating sessions of people trying to explore our system or load us with malware. Eventually somebody figured it out and then DDoSed the hell out of our stuff and would not stop hassling us. It was a good reminder that yanking people's chains sometimes really pisses them off and can attract attention and grudges that you really don't want. My friend ended up retiring his domain because he got tired of dealing with the special attention. It did allow us to capture some pretty fascinating data though that actually improved our security while it lasted.
reply
This is one reason why most crawlers ignore robots.txt now. The other reason is that bandwidth/bots are cheap enough now that they don't need web admins to help them optimize their crawlers.
reply
While I sympathise, I disagree with your stance. Cloudflare handles a large % of the Internet now because people put sites behind it that, as you admitted, don't need to be there.
reply
[dead]
reply
deleted
reply
OP is about Github. Have you seen the Github uptime monitor? It’s at 90% [1] for the last 90 days. I use both Codeberg and Github a lot and Github has, by far, more problems than Codeberg. Sometimes I notice slowdowns on Codeberg, but that’s it.

[1] https://mrshu.github.io/github-statuses/

reply
I stopped using GitHub a long time ago. I don't understand why GitLab isn't the default alternative.
reply
To be fair, GitHub has several orders of magnitude more users running on it than Codeberg. I'm also a Codeberg user, but I don't think anyone has seen a Forgejo/Gitea instance working at the scale of GitHub yet.
reply
I don't think OP was making a value judgment or anything. It's just weird to say you won't consider Codeberg because you need reliability when Codeberg's uptime is at 100% and Github's is at 90%.
reply
To be fair, GitHub has several orders of magnitude more revenue to support that, including from companies like mine who pay them good money and get absolutely sub-par service and reliability in return. I'd be happy for Codeberg to take my money for a better service on the core feature set (git hosting, PRs, issues). I can take my CI/CD elsewhere; we self-host runners anyway.
reply
I think the idea is that a Forgejo/Gitea instance should never have to work at anywhere near the scale of GitHub. Codeberg provides its Forgejo host as a convenience/community thing but it's not being built to be a central service.
reply
My own git server has been hit severely by scrapers. They're scraping everything. Commits, comparisons between commits, api calls for files, everything.

And it's pretty much all of them: ByteDance, OpenAI, AWS, Claude, and various ones I couldn't recognize. I basically just had to block all of them to get reasonable performance for a server running on a mini-pc.

I was going to move to Codeberg at some point, but they had downtime when I was considering it; I'd rather deal with that myself, then.

reply
Anyone actually scraping git repos would probably just do a 'git clone'. Crawling git hosts is extremely expensive, as git servers have always been inadvertent crawler traps.

They generate a URL for every version of every file on every commit and every branch and tag, and if that wasn't enough, n(n+1)/2 git diffs for every file on every commit it has existed on. Even a relatively small git repo with a few hundred files and commits explodes into millions of URLs in the crawl frontier. Server side, many of these are very expensive to generate as well, so it's really not a fantastic interaction between crawler and git host.
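
A back-of-envelope check of that blow-up, using the n(n+1)/2 figure (the repo size is an arbitrary example):

```shell
# Pairwise diff pages a crawler can discover among n commits.
diff_pages() { n=$1; echo $(( n * (n + 1) / 2 )); }
diff_pages 300   # a modest repo with 300 commits: 45150 diff URLs
```

Multiply that by the number of files touched and it's easy to see how a crawler gets stuck for days.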

If you run a web crawler, you need to add git host detection to actively avoid walking into them.

reply
And yet, it's exactly what all the AI companies are doing. However much it costs them in server costs and goodwill seems to be worth less to them than the engineering time to special-case the major git web UIs.
reply
I doubt they're actually interested in the git repos.

From the shape of the traffic it just looks like a poorly implemented web crawler. By default, a crawler that does not take measures to actively avoid git hosts will get stuck there and spend days trying to exhaust the links of even a single repo.

reply
How probable is your "probably"?
reply
Well, one is 60 repos per hour, and the other is 60 hours per repo.
reply
The whole point of git is to be decentralized so there is no reason for you to not have your current version available even when a remote is offline.
reply
How do people even on hacker news of all places conflate git with a code hosting platform all the time? Codeberg, GitHub or whatever are for tracking issues, running CI, hosting builds, and much more.

The idea that you shouldn't need a code hosting platform because git is decentralized is so out of place that it is genuinely puzzling how often it pops up.

reply
How do people on hacker news keep having reading issues?

The parent post mentioned: "the worst thing ever for me as a developer is having the urge to code and not being able to access my remote."

Emphasis on "code", not triaging issues, merging other people's branches, etc.

Besides, there are tools to sync Forgejo repositories, including PRs and issues.

reply
OP didn't conflate them.

They said they want to be able to rely on their git remote.

The people responding are saying "nah, an unreliable remote is fine because you can use other remotes" which doesn't address their problem. If Codeberg is unreliable, then why use it at all? Especially for CI, issues, and collab?

reply
The person you’re replying to is saying that you can do everything outside of tracking issues, running CI, ... without a remote. Like all Git operations that are not about collaboration. (but there is always email)

Maybe a hard blocker if you are pair programming or collaborating every minute. Not really if you just have one hour to program solo.

reply
It's also trivial to have multiple remotes; I do in most of my repos. When one has issues I just push to the other instead of to both.
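
One way to sketch that (remote names and URLs here are placeholders): give a single remote two push URLs, so a plain `git push` updates both hosts, and you only fall back to pushing one explicitly when the other is down.

```shell
git remote add origin git@github.com:me/repo.git
# First --add must repeat the original URL, since setting any push URL
# stops git from reusing the fetch URL for pushes:
git remote set-url --add --push origin git@github.com:me/repo.git
git remote set-url --add --push origin git@codeberg.org:me/repo.git
git remote get-url --push --all origin   # lists both push URLs
```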
reply
> But they still have downtime

Thank God GitHub is... oh.

https://mrshu.github.io/github-statuses/

reply
Probably has happened at some point, but personally I have not been hit with/experienced downtime of Codeberg yet. The other day, however, GitHub was down again. I have not used GitLab for a while, and when I used it, it worked fine, and its CI seems saner than GitHub's to me, but GitLab is not the snappiest user experience either.

Well, Codeberg doesn't have all the features I did use of Gitlab, but for my own projects I don't really need them either.

reply
> for me as a developer is having the urge to code and not being able to access my remote

I think that's the moment when you choose to self-host whatever git wrapper you prefer. It really isn't that complicated to do, and it even allows for some fun (as in cheap and productive) setups where your forge is on your local network, or really close to your region, and you (maybe) only mirror or back up to a bigger system like Codeberg/GitHub.

In our case, we also use that as an opportunity to mirror OCI/package repositories for dependencies we use in our apps and during development, so not only are builds faster, but we also don't abuse free web endpoints with our CI/CD requests.

reply
I agree. I switched to Codeberg but switched back after a few months. Funny enough, I found there to be more unreported downtime on Codeberg than GitHub.
reply
deleted
reply
> Lazy has nothing to do with it, codeberg simply doesn't work.

I've been working on it for months now; it does work, lol.

reply
I find it ironic that Git was made to get rid of central repos, and then we re-introduced them.
reply
That is what we have been doing for quite some time now, from what I gather. Every time I see something becoming popular, I am like "Hmm, I've seen this before", and I really have. They just gave it a fancier name with a fancier logo and did some marketing, and there you go, old is new.
reply
[flagged]
reply
[flagged]
reply
Thanks for your input
reply