GVFS sounds super dumb
Help me out here, what's the point? It sounds like Git's antiparticle.
Microsoft has adopted Git to manage the vast collection of code that is Windows' source, and has shared performance issues it's had to fix along the way. The state-of-the-nation report for what Microsoft calls the “largest Git repo on the planet” follows on from its launch of the “fat Git repo” handler, the Git Virtual File …
There are tons of companies which store the source for all their products, and presumably their tax returns and porn stash, in a single giant repo. Perforce and ClearCase actually encourage such a way of working.
Now those companies may want to use "git", but of course not to the extent that they would change their way of working and split up their repo a bit. So now they can buy "Microsoft git" which presumably has some token integration with actual git, but for the 30gb repo support you have to use Visual Studio and not the normal git client.
I suppose "embrace and extend" is still a thing at MS.
"but for the 30gb repo support you have to use Visual Studio and not the normal git client."
First of all, you can check out the GVFS source code yourself on GitHub.
Secondly, VS 2017 simply uses git.exe to do git related tasks. VS 2015 took a different approach, putting all the git functionality inside a dll, which was probably convenient at the time but ate quite a lot of memory (the VS team's biggest sin is ignoring 64-bit support for over a decade now).
AFAICT gvfs is simply a layer under git that allows the developers to avoid pulling in the entire repository. Few developers are likely to touch the entire code base, yet the build servers probably need the whole thing.
OTOH, according to the github page, the latest version of Windows 10 *is* a requirement. So some OS support seems to be needed for this to work. I have no idea if this can be ported to other operating systems.
Is it more feasible to force the build servers into pulling thousands of repositories at build time? It would surprise me if the answer is 'yes'.
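The "layer under git" idea described above can be sketched in a few lines: a toy model (all names here are made up for illustration, this is not the actual GVFS design) where the file listing is visible up front but contents are only fetched from the remote on first read.

```python
class LazyRepo:
    """Toy model of a virtualised checkout: every path is listable
    immediately, but a file's contents are downloaded only when
    something actually reads it."""

    def __init__(self, remote_contents):
        # remote_contents stands in for the server holding the full repo
        self._remote = remote_contents
        self._local = {}        # files actually downloaded so far
        self.fetch_count = 0    # how many downloads really happened

    def listdir(self):
        # metadata is cheap: the whole tree is visible with no download
        return sorted(self._remote)

    def read(self, path):
        # contents are pulled lazily, once, on first access
        if path not in self._local:
            self.fetch_count += 1
            self._local[path] = self._remote[path]
        return self._local[path]


repo = LazyRepo({"kernel/boot.c": b"...", "shell/ui.c": b"..."})
print(repo.listdir())     # both paths visible immediately
repo.read("shell/ui.c")   # only this file is actually fetched
print(repo.fetch_count)   # -> 1
```

A developer touching 100 files out of millions only ever pays for those 100, while a build server that reads everything ends up downloading everything.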
Now oldcoder, your comment is really, really dumb.
git is running. All the time. All the change tracking is happening. If you ever need the information, it will be downloaded when you need it. When you don't need the information, it's not downloaded. It's still all there in git.
You have 100 teams working on different things. They all use one git repository. Everything any team member looks at is always there - only the things they don't look at are not downloaded.
>Powershell is the bee's knees. I was late to come around to it, but I'll never use cmd.exe again.<
And look how many keystrokes I can save by typing 'cp' instead of some verbose COBOL crap like 'copy', intended to make scripting 'easy', so that 'you don't need to be a dev to do scripting'.
As if making readable scripts ever worked. That's the problem with readable scripts: it makes people think that anyone can do it.
To be honest, I still don't understand the fascination with Git. The only thing it has going for it (that I can see) is it's free.
Cult of St. Torvalds, I think.
My impression of git is that it feels like something a programmer whipped up in a week or so to scratch an immediate itch, without any thought to user-friendliness or scaling. Which is of course exactly how it originated. He needed to get off BitKeeper ASAP when their license terms became onerous, so he threw something together.
"My impression of git is that it feels like something a programmer whipped up in a week or so to scratch an immediate itch, without any thought to user-friendliness or scaling"
It scales better than SVN and the design is pretty neat - if you bother to understand it. Which takes some effort, as it is indeed quite unconventional (think of a two-dimensional hierarchy, where one dimension is files and the other is commit history). However you make a good point that it was indeed whipped up in a hurry, hence upvote.
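Those "two dimensions" fall straight out of git's object model: a commit points at a tree (the file dimension) and at parent commits (the history dimension). A quick way to see it for yourself, using a throwaway repo (the config values are placeholders):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email dev@example.com && git config user.name dev
echo hello > file.txt
git add file.txt && git commit -qm "first"

# a commit object records a tree plus (for later commits) its parents
git cat-file -p HEAD            # shows: tree <sha>, author, committer, message
# the tree is the snapshot of files at that commit
git cat-file -p 'HEAD^{tree}'   # shows: 100644 blob <sha> file.txt
```

Once that model clicks, branches and merges stop looking magic: they are just pointers into the commit dimension.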
Give the guy his due.
He wanted to continue using Bitkeeper. Lots of people in/around Linux used it and paid for it (even if they didn't always have to).
Then the owner of the company that makes BitKeeper decided to be a twat because someone of Samba fame started to reverse-engineer its proprietary formats so they could integrate with it.
He pulled the rug, the software was made unavailable.
So Linus knocked up an alternative in a few days, which pretty much sent BitKeeper scrambling, and now even Microsoft use it, while BitKeeper is nowhere to be seen. Since the very early days, it's been almost entirely other people - including Microsoft - developing git, but you have to admire the way that was done.
"Okay, you won't play ball any more, despite it being nothing to do with us kernel developers at all? Okay, I'll write an alternative that's more focused on our process, better for us, and does things yours can't. Oh, look, there it is, done. Bye!".
There aren't many people who can re-write an independent implementation of a large commercial product overnight, that ultimately leads to nobody even touching the other software any more, and Microsoft basing product lines and their entire development process on it.
Here's an interview with Linus about it:
https://www.linux.com/blog/10-years-git-interview-git-creator-linus-torvalds
I like git, mostly. I like that it tries to parallel the file system in its use. I like that it makes sense on the command line, doesn't _need_ a daemon and can just be moved by file copies. I am sure that some other version controls do some things better. But it's free, pretty good at what it does and allows you a lot of growth if you want to become expert at it. Subversion never clicked with me, Sourcesafe sucked and Clearcase makes me wonder how its creators feel about creating such a loathed piece of software. So best, in my limited experience, by a long shot
You are being daft here. There is no "Microsoft git". There is git, with all the git commands that you know, using a virtual file system on the client. I'm right now making a living using a git repository of 100 MB. These guys have a repository of 300 GB. Without that virtual file system, git can't handle it. I'll congratulate Microsoft for using a very smart approach to a difficult problem.
I too thought of ClearCase when I read this, with rather mixed memories. I do remember working on a large team where ClearMake really came into its own though, pulling in libraries on the fly that others had compiled. I wonder whatever happened to it... but not enough to google and find out.
I can think of several possible points:
1) If your git repository is 300GB (perhaps because you have several decades of spaghetti dependencies in there) then you don't want to pull it all in at once. The usual DVCS approach of "grab the repo and party on dude" doesn't scale. (Yes I've heard of re-factoring and technical debt. Apparently, despite re-writing Windows from the ground up with every major release, MS haven't.)
2) If your toolchain doesn't support git, you need to make it look more like a normal part of the file system, because everyone supports "normal files". So MS have written a filesystem driver that does that. (According to the blog, they intend to ditch this approach in the longer term, in favour of building git support into NTFS. What ... the ... fsck! Can you spell "retrograde"?)
3) Having done 1 and 2, your next problem is that you don't have all the files locally and still need wire access to the originals, so some kind of proxy might be nice.
I can see that purists might reckon that all this is solving the wrong problem, but if the Right solution is quickly re-factor 300GB of source code then I can also see that MS might be forgiven for pursuing this approach. When you are up to your nose in shit, opening your mouth to call for help isn't necessarily the thing you do first.
SQL Server for Linux is essentially running on a compatibility layer (like Wine, but not Wine) and Visual Studio for Linux is Xamarin Studio renamed. Microsoft isn't even really attempting to change, they're just pretending they are. Using Git is just the best option for them at the moment; they've used a lot of source control systems in the past.
".....despite re-writing Windows from the ground up with every major release, MS haven't."
<citation please>
I've never heard this from Microsoft. That would be like saying Linus writes Linux from the ground up with every major release. It's just utterly stupid and incorrect.
The Achilles heel of Git is that you must pull ALL the repository in order to use any of the repository. Various ways exist to work around this issue - shallow clones, submodules, subtrees, repo etc. - but nothing is very good.
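The shallow-clone workaround mentioned above looks like this in practice - a minimal demo against a throwaway local repo standing in for the huge remote (names and paths are made up):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# stand-in for a huge remote: a repo with two commits of history
git init -q big && cd big
git config user.email dev@example.com && git config user.name dev
echo v1 > file.c && git add file.c && git commit -qm "first"
echo v2 > file.c && git commit -qam "second"
cd ..

# shallow clone: only the most recent commit comes across
git clone -q --depth 1 "file://$tmp/big" shallow
cd shallow
git rev-list --count HEAD    # prints 1, not 2 - history is truncated
```

It saves transfer, but the truncated history is exactly why it's "not very good": anything that needs to walk history (blame, bisect, merges against old branches) degrades.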
I suppose the idea for GVFS is that when you do a clone of Windows, you don't transfer 300GB of crap to your machine before you even start. Instead you "clone" and the filesystem looks like the files were fetched but the fs only fetches a file's contents on first read. So if you're working on one DLL with 100 files you don't need to download the gazillion other files in the codebase.
Clearcase (contender for the worst source control system ever invented) did this too with a thing called a dynamic view. The difference in Clearcase's case was that the dynamic view could change while you were using it if someone else committed files to the same view. Enjoy trying to debug problems when headers and sources keep changing underneath you.
At least GVFS would behave like Git in that what you see isn't going to change unless you pull / fetch / merge. I'd like to see how MS intend to open this up outside of themselves though.
>Clearcase (contender for the worst source control system ever
Contender? You're being unduly generous and magnanimous. More like so far ahead that no one else is in the same game.
Add to it quite possibly the worst GUI ever inflicted on users. And the crappiest and flakiest backend Windows services.
Clearcase the worst? Pfeh, I see you never tried Visual SourceSafe. Combine the primitiveness of RCS with the complexity of Clearcase - or perhaps it was just Microsoft's MFC-era designers' capacity to overcomplicate things by exposing the wrong things to the user - and you are close.
Full disclosure: I have only ever encountered VSS briefly because, well... see point above. Clearcase might have caused more problems to the world by luring a team in until it is too late, but we are talking about the worst source control system, not the most evil.
No, it's a 300GB repository, not a 300GB code base - it includes all the branches, and large teams like the one working on Windows usually make extensive use of branches, unlike most smaller projects mostly working on a single one, maybe using branches only to mark releases and some maintenance.
That's actually a bit unclear: whether the total repo is 300GB or a single branch. But note that git pulls one branch at a time, so the relevant number for scalability is the size of a single branch.
To put into perspective how ridiculously large this is: the source code for the entirety of Debian is about 270GB. And that contains a vast suite of applications: everything from EDA tools to several office suits to multiple browsers to compilers to FPS games. A total of 28 thousand different packages. Windows is big, but not that big.
Given this, it is almost a certainty that the 300GB is not just source code. Perhaps it contains the entire build chain. Perhaps they are storing build artefacts in the repo.
I guess it depends if you only do "source code" control or "whole version" control.
It seems likely that they hold everything in there so you can track the code, the compiler settings, the resources and of course the test results.
As others have noted there will likely be different branches for "Home" "Small Business" "Enterprise" editions as well
OTOH I'm not so sure that includes Office, Dynamics or the languages.
But Debian doesn't handle, for example, the whole Linux kernel repository, and all its commits/branches, I guess it just pulls some of it. The same is true for other projects, when they are hosted elsewhere and not directly by Debian.
I have some open source project inside my VCS for libraries and applications I need to build latest versions not directly supported by Debian - but I just pull the stable releases, not the whole commit history.
Inside the Windows repository there are probably all the versions of Windows they need to support (which may stretch down to XP, if it's still on paid support), the upcoming ones, plus the SDKs and related development and build tools.
Two different businesses, working in a different way.
"the source code for the entirety of Debian is about 270GB. And that contains a vast suite of applications: everything from EDA tools to several office suits to multiple browsers to compilers to FPS games."
I'd love to be able to modify the behaviour of certain office suits.
Where do I get the sources?
> But note that git pulls one branch at a time ...
Hmmm, not exactly. When doing a "git clone <url>" it will (by default) grab the entire remote repository (all branches) and set that up locally.
"One branch" is only what gets pulled after that, when you're updating things (eg "git pull <remote> <branch>"). And that's only if you've not set up further tracking between your local branches and remote ones (eg "git checkout -b somebranch --track <remote>/<branch>").
So, "one branch at a time" is kind of yes and no, but mostly not really. ;)
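The all-branches-by-default behaviour is easy to check with a throwaway repo (names below are placeholders); `--single-branch` is the flag that actually limits a clone to one branch:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# a "remote" with two branches
git init -q origin-repo && cd origin-repo
git config user.email dev@example.com && git config user.name dev
echo a > f && git add f && git commit -qm "c1"
git branch feature
cd ..

# default clone: every remote branch is fetched
git clone -q "file://$tmp/origin-repo" full
git -C full branch -r | grep -v HEAD     # two remote branches listed

# --single-branch: only the default branch comes across
git clone -q --single-branch "file://$tmp/origin-repo" single
git -C single branch -r | grep -v HEAD   # just the one
```

Note the distinction either way: a clone fetches all branches' *objects*, but checks out only one working tree, which is why people remember it both ways.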
Yeah, I was about to comment on that myself.
I think it's a sad display if you're selling items and then don't use them for your own setup. I mean: doesn't that tell us something about the items you're trying to sell us? I'm always very keen on that myself.
Back in the iPaq days the CEO of Compaq would give speeches and all, and what was the one small detail which managed to catch my eye? He didn't use an iPaq, no way: he often used pen and paper to jot down notes. Errr, ok... So it wasn't that revolutionary a product which everyone could use after all, eh?
Microsoft, back in the days (1990 - 2000), relied on Unix (Sendmail) to handle all their e-mail, because Exchange just couldn't handle it. Rumor even has it that they had tried to migrate a couple of times but that Exchange completely crashed because it simply couldn't handle the load. Now, in all honesty we need to keep in mind that Exchange was more than an MTA alone, so my example is a little bit flawed. But even so...
And there are tons of examples. When a company tries to sell you a product and it turns out that they're not using it themselves, then I think something isn't quite right with the product ;)