Google's Grumpy code makes Python Go

Bronze badge

I looked at both Go and Apple iOS Swift. Underwhelmed both times. Is it me?

6
3

Yes, its you.

I can't speak for Swift, but I've written a bunch of stuff in Go. It's very fast (both running speed and compile-wise), portable across platforms, the syntax is sane and it's very easy to learn.

But its biggest 'party trick' is built-in concurrent processing. Until you've used this, it's hard to explain just how useful it is in modern mobile/web applications.

I evaluated a bunch of 'new' languages for a project a few years ago. Among them were Go, D, C#, Clojure, Node (not a language but still), Scala, Erlang (and its derivatives) as well as more traditional languages like Python.

Go was easily the easiest to learn and had the most useful optimizations. It's pretty obvious that whoever was the driving force behind it had modern async, massive-scale internet-based applications in mind. Def. my favorite language now, although I wish it had better dictionary/list/array handling, as it is not as simple to build complex multi-dimensional data structures as in some other languages.
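
To make the complaint about nested data structures concrete, here is a minimal sketch, assuming nothing beyond the standard library. The helper names newGrid and addScore are made up for illustration. The point is that every inner slice or map has to be allocated by hand before it can be used.

```go
package main

import "fmt"

// newGrid builds a rows x cols matrix. Each inner slice has to be
// allocated explicitly in a loop.
func newGrid(rows, cols int) [][]int {
	grid := make([][]int, rows)
	for r := range grid {
		grid[r] = make([]int, cols)
	}
	return grid
}

// addScore records a score under region -> user, creating the inner
// map on first use. Writing into a nil inner map would panic.
func addScore(scores map[string]map[string]int, region, user string, n int) {
	if scores[region] == nil {
		scores[region] = make(map[string]int)
	}
	scores[region][user] += n
}

func main() {
	grid := newGrid(2, 3)
	grid[1][2] = 7

	scores := make(map[string]map[string]int)
	addScore(scores, "eu", "alice", 5)
	addScore(scores, "eu", "alice", 2)

	fmt.Println(grid, scores["eu"]["alice"])
}
```

In a language such as Python the equivalent is usually a one-line literal or a defaultdict, which is presumably the comparison being made.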

If you are doing anything on a large scale (mobile, web, big data, etc), then Go is a fantastic language.

10
0
Silver badge

Re: Yes, its you.

If you like Go's concurrency you may also like Rust. That too gives you Channels which can be rendezvous mechanisms.

Rust is beginning to look good because it is suitable as a systems language. There's a whole lot of youngsters at work getting excited about it.

To us old timers the re-emergence of CSP channels is a delight! It's definitely the easiest way to do parallel execution. Rust'll save me having to implement CSP in C++ for myself every time...

3
0
Silver badge

Re: Yes, its you.

...At least you might like Rust once they've decided whether or not to keep channels and select...

Tracking Issue for Channel Selection

2
0

Re: Yes, its you.

It's nice to see that the youngsters have got around to inventing CSP again ;)

4
0
Silver badge

Re: Yes, its you.

"To us old timers the re-emergence of CSP channels is a delight! It's definitely the easiest way to do parallel execution. Rust'll save me having to implement CSP in C++ for myself every time..."

Is using pthreads REALLY so hard? There seems to be a lot of noise about the latest flavour of the month concurrent languages but in reality all they do is prettify (and arguably simplify) threading syntax and control then make the same underlying calls to the OS threading system. They don't actually give you any more power.

Also, in all the fuss multi-process seems to be forgotten about. Admittedly on Windows this is a bit of a lost cause due to its piss-poor support for it, but on Linux/Unix and OS X, which all support fork() and the POSIX multi-process API, it's an extremely useful tool in any serious systems programmer's armoury.

3
1

Re: Yes, its you.

> It's nice to see that the youngsters have got around to inventing CSP again ;)

Um, yah. Rob Pike and Ken Thompson are wee ankle-biters.

3
0

Re: Yes, its you.

"They don't actually give you any more power."

Well in principle perhaps, but context switching is expensive. go distributes goroutines across a small number of pthreads (e.g. one per core), and *it* rather than the os decides how optimally to switch between goroutines. You can write an app with huge numbers of goroutines without worrying that the platform has been tuned for the large number of threads, or worrying about the impact of the os thrashing between threads when perhaps it can be avoided. That's a win. And as goroutines are lightweight, and channels between goroutines are built into the language, it makes it easy to use parallelism to implement 'map-reduce' type algorithms, even with hundreds of thousands of concurrent 'threads'. Tuning and efficiency aside, writing go concurrent code is idiotically easy.

This page describes the design choices made:

https://dave.cheney.net/2015/08/08/performance-without-the-event-loop
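
A minimal sketch of the 'map-reduce' style described above, using only the standard library. The function name sumSquares and the workload are invented for illustration. Each input gets its own goroutine (the map step) and the results are folded as they arrive on a channel (the reduce step).

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares fans out one goroutine per input value and sums the
// squares that come back on an unbuffered channel.
func sumSquares(n int) int {
	results := make(chan int)
	var wg sync.WaitGroup

	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			results <- v * v
		}(i)
	}

	// Close the channel once every worker has sent its result,
	// so the range loop below terminates.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for sq := range results {
		total += sq
	}
	return total
}

func main() {
	// 10,000 goroutines; the runtime multiplexes them onto a small
	// number of OS threads.
	fmt.Println(sumSquares(10000))
}
```

Nothing here tunes thread counts or stack sizes; the scheduler decides how the goroutines share the underlying threads.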

3
0
Silver badge

Re: Yes, its you.

@boltar,

Is using pthreads REALLY so hard? There seems to be a lot of noise about the latest flavour of the month concurrent languages but in reality all they do is prettify (and arguably simplify) threading syntax and control then make the same underlying calls to the OS threading system. They don't actually give you any more power.

PThreads are harder to get right (from the point of view of being sure there's no potential for deadlock, lock failures, etc). Shared memory, semaphores, etc can be fearsomely difficult to debug, etc.

CSP in particular makes it easier to get it right, or at least if you don't you're guaranteed to find out as soon as you run your system. If you've written it such that it can deadlock, it will deadlock.
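
As a rough illustration in Go (chosen here only because it is the language under discussion), the sketch below passes a counter back and forth over unbuffered channels. The function name pingPong is made up. Every send blocks until the other side receives, so the two goroutines proceed in lockstep without any locks.

```go
package main

import "fmt"

// pingPong sends a value to a worker goroutine and waits for the
// incremented reply, `rounds` times. Both channels are unbuffered,
// so each send is a rendezvous with the matching receive.
func pingPong(rounds int) int {
	ping := make(chan int)
	pong := make(chan int)

	go func() {
		for v := range ping {
			pong <- v + 1
		}
		close(pong)
	}()

	v := 0
	for i := 0; i < rounds; i++ {
		ping <- v  // blocks until the worker receives
		v = <-pong // blocks until the worker replies
	}
	close(ping)
	return v
}

func main() {
	fmt.Println(pingPong(5))
}
```

If the main loop forgot the receive from pong, every goroutine in this small program would end up blocked and the Go runtime would stop it with "fatal error: all goroutines are asleep - deadlock!", which is the "you find out as soon as you run it" behaviour in practice. In a larger program with other goroutines still running, that check would not fire.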

Also you can do some process calculi maths and prove the correctness of a CSP design theoretically. That's not something you can do with pthreads, shared memory and semaphores in anything but the most trivial of cases.

As for more power, one has to be a bit careful. Pthreads, shared memory and semaphores all assume that they're running on top of an SMP computer architecture, whereas CSP is quite content to exist on NUMA architectures. What Intel actually give us is, effectively, a NUMA machine with SMP emulated on top. Thus software that's more NUMAry in operation can be kinder to the QuickPath interconnect between CPUs and the shared L3 caches.

And because CSP is NUMA friendly, it's quite straightforward to scatter CSP processes around a network and scale up (though Go and Rust don't AFAIK do this for you). That's a complete no-no with shared memory, semaphores and pthreads.

3
0
Silver badge

Re: Yes, its you.

Um, yah. Rob Pike and Ken Thompson are wee ankle-biters.

Tony Hoare is 20 years older, and is entitled to call a 52 year old Rob Pike a youngster!

3
0
Silver badge

Re: Yes, its you.

"go distributes goroutines across a small number of pthreads (e.g. one per core), and *it* rather than the os decides how optimally to switch between goroutines"

So in other words you have a mini VM doing in-house context switching sitting on top of the OS threading system also doing context switching. That might make the end coder's life easier but it doesn't sound efficient from a runtime POV to me. I tend to think the OS should be left to do what it's designed to do. You wouldn't write your own filesystem driver or TCP stack inside a language so I'm not sure why language designers think they need to co-opt threading.

1
5
Silver badge

Re: Yes, its you.

"Also you can do some process calculi maths and prove the correctness of a CSP design theoretically"

That's because CSP is a formal language. But that doesn't help you when you have to map it into a real language.

"What Intel actually give us is, effectively, a NUMA machine with SMP emulated on top"

It doesn't really make a lot of difference what architecture the board & CPU have. You have to go via the OS API unless you plan on having your code talk directly to the hardware, either running as root/admin (which IMO is really not a clever idea) or via a specially written OS driver.

"And because CSP is NUMA friendly, it's quite straight forward to scatter CSP proecesses around a network and scale up"

Distributed processing is another topic entirely. The overhead is huge and so it's really only suitable for large data crunching or search SIMD-style programs. E.g. map reduce.

2
1
Silver badge

Re: Yes, its you.

"Rust is beginning to look good because it is suitable as a system's language. There's a whole lot of youngsters at work getting excited about it."

Kids get excited about some new language or paradigm every month. This time next year they'll have probably moved on to the next hot language which they think will look good on their CVs.

2
1

Re: Yes, its you.

"Is using pthreads REALLY so hard?"

From my experience of reviewing code other people have written that uses pthreads over the last 18 or so years: Yes.

pthreads has the building blocks there, but they're error prone, poorly thought out, and use concepts that are too low level to be directly useful.

An example: condition variables. Almost everyone that uses them wants a process-scope semaphore (unix semaphores are an optional part of POSIX iirc, and certainly NOT part of pthreads). Almost everyone that uses them gets the implementation details wrong, by either handling the related locking wrong, or not correctly dealing with false wakeups, or not holding the lock whilst signaling, etc, etc.
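
The rules listed above can be sketched with Go's sync.Cond, which follows the same mutex-plus-condition-variable pattern as pthreads. This is an illustration in a different language, not pthreads code, and the Semaphore type and maxConcurrent helper are invented for the example. It shows a counting semaphore that re-checks its predicate in a loop and signals while holding the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Semaphore is a counting semaphore built from a mutex and a
// condition variable. The names are illustrative.
type Semaphore struct {
	mu    sync.Mutex
	cond  *sync.Cond
	count int
}

func NewSemaphore(n int) *Semaphore {
	s := &Semaphore{count: n}
	s.cond = sync.NewCond(&s.mu)
	return s
}

// Acquire blocks until a permit is available.
func (s *Semaphore) Acquire() {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Re-check the predicate in a loop: being woken only means
	// "look again", not "a permit is yours".
	for s.count == 0 {
		s.cond.Wait() // unlocks mu while waiting, re-locks before returning
	}
	s.count--
}

// Release returns a permit and wakes one waiter.
func (s *Semaphore) Release() {
	s.mu.Lock()
	s.count++
	s.cond.Signal() // signalled while the lock is held
	s.mu.Unlock()
}

// maxConcurrent pushes `workers` goroutines through a semaphore with
// `limit` permits and reports the peak number holding a permit at once.
func maxConcurrent(workers, limit int) int {
	sem := NewSemaphore(limit)
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		inside int
		peak   int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem.Acquire()
			mu.Lock()
			inside++
			if inside > peak {
				peak = inside
			}
			mu.Unlock()

			mu.Lock()
			inside--
			mu.Unlock()
			sem.Release()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// The peak never exceeds the number of permits.
	fmt.Println(maxConcurrent(50, 3))
}
```

In everyday Go a buffered channel such as make(chan struct{}, n) is often used as a semaphore instead, which sidesteps the condition variable entirely.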

pthread_detach is another example of a tool that's often wielded but ends in tears half the time.

The sad fact is, the majority of software developers out there can't use pthreads correctly. When that many get it wrong, the problem is IMHO not the developers.

5
0

Re: Yes, its you.

Go context switches are cheaper than OS context switches. If you have more threads, it's more efficient for them to be virtual.

3
0
Anonymous Coward

And I looked at Python, and came away not believing how anything so naff could actually have any traction whatsoever.

I mean who came up with that dire idea to use whitespace for code logic flow??? What was wrong with curly braces???

8
5
Silver badge

Re: Yes, its you.

@Boltar,

That's because CSP is a formal language. But that doesn't help you when you have to map it into a real language.

There are plenty of real language CSP implementations. I've written some myself. You design your system, express it algebraically in CSP, do the algebra, prove that the system is correct. You then write source code that implements the same architecture that you've expressed in CSP, compile & run. Go and Rust are but the latest in a long line of CSP implementations, but the advantage is that there's real merit (excellent memory handling, etc) to the languages themselves beyond just their CSPness.

It doesn't really make a lot of difference what architecture the board & CPU have.

No, but Intel and everyone else is really struggling to get improved SMP performance, especially when it comes to large machines. The amount of silicon overhead required to make SMP work well on top of what is fundamentally a NUMA architecture is really quite large, which costs power, money, etc.

The only reason they still do it is the large amount of pre-existing software that has been written around SMP, including all mainstream operating systems. We're not going to throw those out any time soon.

You have to go via the OS API unless you plan on having your code talk direct to the hardware...

It doesn't work like that. SMP makes all your process memory equally accessible no matter where the memory is and where the process thread(s) are running. You use the OS to handle threads, semaphores, locks. Memory access across QPI or L3 cache does not require OS intervention.

However, for one thread to access memory shared with another in an SMP system, it has to take a semaphore (assuming one is locking shared memory), access the memory (which involves a whole load of data transfers up to L3 cache / across QPI), give the semaphore back. And all the while the separate L1/L2 caches have to be kept coherent, because two (or more) cores are all accessing the same memory address.

CSP, although it copies data (rather than sharing it), involves essentially the same amount of work, just dressed up as some sort of IPC transaction instead. For example, an OS pipe still involves taking semaphores, accessing memory (in order to copy it), giving semaphores back. The difference is that because CSP copies the data instead of sharing it, there ends up being less cache coherence traffic running across the QPI or up to the L3 cache, because the source and copied data are being accessed by only one thread each.

The fact that CSP implementations are doing this at all stems from the fact that they're sitting on top of a faked SMP machine that is itself sitting on top of and completely obscuring a NUMA architecture. If the SMPness were to be omitted altogether, the CSP would be far more efficient (helped further by being able to omit the silicon that implements the SMP environment). Given a genuine NUMA hardware environment (such as a network of Transputers) CSP is a tremendously good fit.

Distributed processing is another topic entirely.

It is, but it really, really shouldn't be.

Things like ZeroMQ do a wonderful job of completely abstracting away the means by which data is moved in Actor model applications (it nearly does CSP). Intra process, inter process, across a network, nothing that matters changes in your source code. The performance changes, yes, but that's no big deal. You already know that it'll be slower across a network, and can plan one's distribution accordingly. The key thing is that changing the distribution is very little work, there's almost no re-coding to be done.

In contrast, if one has used, say, Rust or Go's CSP mechanisms in-process, and then decides to distribute processes / threads across a number of machines, one has a major rewrite on one's hands. The CSP channels in those languages don't propagate across network connections AFAIK. Bad karma.

0
0
Silver badge

"

And I looked at Python, and came away not believing how anything so naff could actually have any traction whatsoever.

I mean who came up with that dire idea to use whitespace for code logic flow??? What was wrong with curly braces???"

I have to agree. Something that you cannot see or print or write down on paper matters? Rubbish.

5
0
Silver badge

Re: Yes, its you.

@smartypants,

"Go context switches are cheaper than OS context switches. It you have more threads, it's more efficient for them to be virtual."

Have you ever programmed for OSes like VxWorks? Some of these hard-RT OSes have ultra-low context switch times (kinda the point I guess). Anyway, one tends to get thread-happy on such OSes because the context switch penalty just isn't that much of a deal. I always found it quite liberating!

But I do like the idea of there being only as many native threads as cores with "green threads" (I think that's their technical term) being used within the language, especially on top of Linux / Windows / etc. I wonder if Go can cope with a thread blocking on network I/O?

This is the problem that Python completely failed to address, so they went multi-process as a "terrible kludge". Perhaps that's too harsh - the Python guys have never put it forward as a hard-core, high-performance systems language (though I reckon there are plenty of people trying to use it that way...).

0
0
Anonymous Coward

I have to agree. Something that you cannot see or print or write down on paper matters? Rubbish.

You have trouble seeing indentation?

1
1

Re: Yes, its you.

> I wonder if Go can cope with a thread blocking on network I/O?

All I/O requests are handled in a special thread called the netpoller, which multiplexes all the file descriptors (with select/epoll).

It can become a bottleneck if the program is handling a lot (>100Mbps) of network traffic.

This is the problem I'm having at the moment with some code I've written. I've mitigated it by dedicating an OS thread to each TCP port with runtime.LockOSThread() and using some inline C (or /x/sys/unix) to read the data with direct syscalls.
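
A rough sketch of the thread-pinning half of that workaround, assuming only the standard library. runtime.LockOSThread and runtime.UnlockOSThread are real calls; pinnedReader and fakeSource are hypothetical names, and the in-memory source merely stands in for the direct syscall reads, which are not shown.

```go
package main

import (
	"fmt"
	"runtime"
)

// pinnedReader calls read() repeatedly on a goroutine that is wired
// to its own OS thread and forwards each chunk on the returned
// channel. read stands in for a blocking, syscall-based read and
// returns false once the source is exhausted.
func pinnedReader(read func() ([]byte, bool)) <-chan []byte {
	out := make(chan []byte)
	go func() {
		// Keep this goroutine on one OS thread for its whole life.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		defer close(out)
		for {
			buf, ok := read()
			if !ok {
				return
			}
			out <- buf
		}
	}()
	return out
}

// fakeSource is a hypothetical in-memory source standing in for a socket.
func fakeSource(chunks ...string) func() ([]byte, bool) {
	i := 0
	return func() ([]byte, bool) {
		if i >= len(chunks) {
			return nil, false
		}
		b := []byte(chunks[i])
		i++
		return b, true
	}
}

func main() {
	total := 0
	for buf := range pinnedReader(fakeSource("abc", "de")) {
		total += len(buf)
	}
	fmt.Println(total)
}
```

Whether this actually relieves a netpoller bottleneck depends on replacing the stand-in with real blocking reads, so treat it as the shape of the approach rather than a measured fix.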

It's not perfect, but if I was trying to write the app in C I'd still be bashing rocks together.

1
0
Bronze badge
Holmes

Agreed, Python also looks like it has loads of obvious performance traps designed into it, "for ease of use", and no easy way to avoid them, so I'm not at all surprised that high performance applications keep having to be migrated off Python.

Python is also rubbish because it lacks method parameter data type declarations, so it blocks static analysis, including usage searches, automated refactoring and security checks (!), and will probably block most easy runtime optimisations! Duck typing is always a foul on method interface boundaries!

0
0

Re: Yes, its you.

Did you forget C++17?

0
0
Anonymous Coward

When it doesn't compile because you only have 11 spaces and not 12, then yes. You just want to smash the keyboard.

0
0
Anonymous Coward

Ease of use? Been fighting to get the pip package manager to install on Windows for the last hour. How they can screw that up so badly is beyond me.

0
0
Silver badge

Calling BS

Citing performance is misdirection. This is really because Google doesn't like anything from the real world leaking into its bubble of a parallel universe. Python is slow because of the features it does and doesn't support. Embedding custom sections of C++/Java/Go into Python isn't much of an efficiency boost either because the inputs and outputs are still constrained by Python's design.

A cross compiler allows Google to get rid of CPython and, eventually, create a private version of Python that is not foreign matter in Google's parallel universe.

4
5

Re: Calling BS

Why would you want a 'private version of Python' when you have Go? If there is anything nefarious here, it is that Google wants to migrate all Python code to Go... which is not surprising.

5
0
ST
Silver badge
Mushroom

Re: Calling BS

Google is desperately trying to make Go relevant. That's because Go is completely irrelevant to anyone but a small number of Google developers in Mountain View. It's a solution in search of a problem.

Oh, it's fast? Goodie. So is C. Ooooh, C no good to the vast majority of developers born after 1989. Memory management, pointers and leaks.

So, to make Go relevant, Google writes a Python -> Go compiler, then rewrites its public APIs in Go - they are Python currently - and there you have it. Now everyone has to learn the new Go APIs. That makes Go relevant.

8
7
Silver badge

Re: Calling BS

Python is slow because the core interpreter holds a global lock - the so called GIL.

So any multi-threading, asynchronous processing and _MOST_ _IMPORTANTLY_ garbage collection are done in one running thread.

Recompiling it as Go is not the answer. Getting rid of the GIL, at least for garbage collection, is.

4
0

Re: Calling BS

We find go very useful in a number of contexts, and we are not Google. Perhaps we're special?

I'd rather dig my eyes out with a protractor than revert to using C for these projects.

And I've been coding long enough to remember when C was the best choice for most stuff we did.

It's common to see people dismissing anything from the last 25 years as not worth bothering with, but it's usually too broad a brush. Though that IoT hairbrush...

6
0
Anonymous Coward

Re: Calling BS

Not a Google developer and I use Go. Currently using it as a backend that serves millions of image views per month, with sizes customised to the user's browser, on the fly. I hate to think how long it would have taken to write and get the same level of performance in C/C++ without any memory leaks anywhere. I'm sure it's a percent or two slower than a pure C program, but for the short development time I can cope with that.

4
0
Silver badge

Re: Calling BS

"I hate to think how long it would have taken to write and get the same level of performance in C/C++ without any memory leaks anywhere"

Of course you're assuming Go itself doesn't have any bugs that lead to memory leaks or similar issues? Don't think that's likely? Check out the problems Java has had over the years.

2
2
Silver badge

Re: Calling BS

Surely if converting to Go and then compiling produces a faster result then the problem isn't Python the language, merely Python the implementation? In this use case Go is merely an intermediate code. Compiling Python directly into LLVM IR might have been the route less insistent on throwing Google's own language in _somewhere_?

1
0
Silver badge

Re: Calling BS

"I'm sure it's a percent or two slower than a pure C program, but for the short development time I can cope with that."

As a performance specialist, it's a shame I can only upvote this once. More than 95% of performance issues can be improved by improving the code (data architecture, etc), and the sooner you've got the code working the sooner you can start. By the time you get down to asking "could I use a faster language?", simply using faster hardware is frequently just as appealing, if not more so.

8
0

Re: Calling BS

Boltar, I think you miss the point.

If you rely on a memory manager, the scope for memory problems shrinks dramatically. Even very good developers routinely leave all sorts of memory issues in their apps, requiring lots of effort to identify the screw-ups. Relying instead on a memory manager doesn't eradicate the possibility of error, but the issues rarely are to do with the memory manager, and instead are to do with failing to clean up references or creating large amounts of garbage when you could avoid it.

Perhaps you're that rare developer who can write perfect leak-free code every time. Lucky you!

The main downside to using a memory manager is the GC sweeps, which introduce occasional latencies that might ruin your signal-processing app. For 99% of applications, a memory manager's performance impact is utterly irrelevant.

This article gives a good 3rd party account of the state of play with Go and GC:

https://blog.pusher.com/golangs-real-time-gc-in-theory-and-practice/

3
0
ST
Silver badge
Mushroom

Re: Calling BS

> We find go very useful in a number of contexts, and we are not Google. Perhaps we're special?

Not really. You can't write C. Nothing special about that.

2
5

Re: Calling BS

Memory leakage has never been a problem for C++ programmers unless they continue to write C in C++.

3
4
Orv
Silver badge

Re: Calling BS

I'd wager the odds of someone noticing and fixing a memory leak in the Go implementation are quite a bit higher than me noticing an odd edge case in my own code.

I'm reminded of the debate over using libraries that happened when the first big zlib bug hit. Yeah, programs that linked to zlib were vulnerable, but upgrading the library on a system fixed them all. The programs that didn't link to zlib mostly cut-and-pasted the zlib code, sometimes with their own twists, and all of those had to be found and fixed individually.

2
0
Silver badge

Re: Calling BS

A cross compiler allows Google to get rid of CPython and, eventually, create a private version of Python that is not foreign matter in Google's parallel universe.

This is some of the biggest horse shit I've come across in a while. Google does loads of stuff in Python and actively contributes to lots of products. They wanted to use it for systems stuff but hit problems with the GIL and Go was a reasonable solution for some stuff. A good developer keeps an open mind and Google is keen on good developers.

I've not used Go myself but I know plenty of developers who are comfortable switching between Go and Python depending on the task in hand. Go's builtin support for concurrency and parallelism is fantastic for some situations, though not for all, as Ben Bangert's talk on Python to Go and back makes clear. In many situations being able to use PyPy will solve most performance problems and this sounds like an extension of the idea where better parallelism is required.

Meantime asyncio is getting traction and Python is also looking at ways of alleviating problems associated with the GIL.

1
1
Silver badge

Re: Calling BS @J H Woods

But as soon as you have the code working the PHB will put it live. It's very important that nothing works until everything does.

0
0
ST
Silver badge
Mushroom

Re: Calling BS

> Even very good developers routinely leave all sorts of memory issues in their apps

No they don't. Those who do aren't good developers.

Developers who can write decent C or C++ use Valgrind and a whole collection of code sanitizer libraries. Evidently you've never heard of these tools, so Go Google now.

Pun intended.

1
1

Re: Calling BS

Re: Valgrind...

So what you're saying is that even good developers can't be trusted to write sound code with C or C++ without pushing it through tools which point out all the mistakes they've made.

...and this isn't a criticism of the language?

---

I first used such tooling back in the 1990s (a tool called 'Purify' - still going I see) to debug memory leaks in the first large system we wrote in C++ (An image-processing application). This was before the STL was a thing and early versions of C++ were pre-compiled down to C using 'cfront' compilers.

It was brilliant, and it found loads of issues our team had left in this project (and our other released products). Though we were glad of this tool, we had a great team, and we did reflect on the difficulty of using a language which made it so easy to get things wrong.

I don't think we were alone. Many modern languages address this simple truth with a better set of tools less littered with tripwires.

Today, using go, the things we build generally have a very stable memory footprint and there tend to be far fewer issues to resolve in terms of memory. When we need additional checks, the go toolset has pretty much all we need. The thing we rely on most is not memory checks but race detection. The -race flag highlights data races, and you can either apply it to tests or bake it into your compiled app. It's just one of a large number of brilliant practical things about the go toolset - go's appeal goes a lot further than just the basic language.
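
The -race flag mentioned above can be tried on a tiny program. This is a minimal sketch with made-up function names: racyCount increments a shared counter with no synchronisation, and safeCount does the same work under a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a shared counter from n goroutines with no
// synchronisation. The result is unpredictable, and the race
// detector flags the unsynchronised writes.
func racyCount(n int) int {
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			count++ // unsynchronised read-modify-write
		}()
	}
	wg.Wait()
	return count
}

// safeCount does the same work but guards the counter with a mutex.
func safeCount(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	// Swap in racyCount and run with `go run -race` to see a
	// "WARNING: DATA RACE" report.
	fmt.Println(safeCount(1000))
}
```

The same flag works with go test -race and go build -race. The detector only reports races on code paths that actually execute, so it complements tests rather than replacing them.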

0
0
Trollface

Only fools and horses

Delboy at work at google these days?

0
0

When you open a can of worms you'll need a bigger can to get them all back in again

"The transcompiled code is not suitable for working with directly."

That's how to create stiffware or keep maintaining unsupported Python 2.7 code forever. We used to have an in-house report generator written in assembly code which was still very useful but nobody knew how to maintain it.

"That said, there is the possibility of rewriting bits and pieces in Go (eg, performance-critical stuff) and then call into it from Python. Sort of a hybrid approach."

You are in a maze of twisty passages all alike.

4
0
Facepalm

Python is the new COBOL?

Having just ended a decade of COBOL development and maintenance (you remember, the language that 'died' by 1970? Not), I have now picked up C# (not by choice) with quite a struggle. Here comes Google to hold back another generation of software developers to maintain their Python 2.7 codebase, which is going to be an archaic curiosity to most in 10 years' time, leaving a bunch of devs with experience in a dead language, having had little time or opportunity in the work environment to learn up-to-date skills. Let's just hope they don't open up this transcompiler, making it a widespread issue.

3
0
Orv
Silver badge

Re: Python is the new COBOL?

Hey now, projects that need programmers for dead languages are important. How else are you going to get work in this field after you turn 40? Even if you learn the latest and greatest language, they won't believe you and will hire some 22-year-old instead.

2
0
Thumb Up

Re: Python is the new COBOL?

Fair comparison. Python was specifically designed to improve upon "beginner languages" like BASIC and COBOL, with little regard for "serious language" concerns like performance, security, or long-term maintainability. After 3 decades of evolution, it's neither. I still find it useful for smaller-scale scripting but that's mainly a reflection on the popularity of other crap languages.

Python's poor performance is NOT just an implementation issue, it's rooted in design choices and hidden complexity. There are faster implementations but they're complex, and compatibility is questionable at best. PyPy took 10 years to become more-or-less usable. Why bother? "Don't reinvent the wheel" was their excuse. Freakin' programmers.

1
0
Silver badge

Just port the bloody code.

Typical.

"Shall we go through the original Python code-base and re-implement it in Go, testing as we go?

NAH! Fuck that. Where's the cool in that? Nah. Let's write a Python to Go converter! I mean, hell, the Go code that the converter spits out will be terrible and un-readable/un-maintainable to humans, but... well, you know... it's a freaking Python to Go converter... How cool is that???!!!!

Yes, that's what we'll do.

Can I put it on my CV yet?"

Programmers. FFS.

5
2
Orv
Silver badge

Re: Just port the bloody code.

Yeah, I admit, as cool as Go might be, having a Python -> Go -> Native toolchain seems like adding an unnecessary step.

2
0
Silver badge
FAIL

Re: Just port the bloody code.

Shall we go through the original Python code-base and re-implement it in Go, testing as we go?

Because computers make fewer mistakes than humans? "Transpiling", particularly with Python, is well understood, reliable and fast, as the PyPy project documents.

0
0

The Register - Independent news and views for the tech community. Part of Situation Publishing