Python explosion blamed on pandas

Silver badge
Happy

Hmm

So, pandering to the audience.

20
0

Re: Hmm

I see what you did there.

1
2
Silver badge

Re: Hmm

BA DUMP BA!

0
0
Anonymous Coward

How did the pandas get explosives?

0
0

> How did the pandas get explosives?

From the guerrillas, duh.

30
0
Silver badge
Boffin

Execution speed...

When you play with big quantities of data in science, the speed is usually limited by inefficient code, not by inherent properties of the language. When I crunch my 5 GB dataset, making a for loop a little faster won't make my code run in a reasonable time -- but moving to a sparse data representation or avoiding the loop altogether will. Python makes those things easy; that's why it is a game changer for science.
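To make the sparse-representation point concrete, here is a minimal sketch using only the standard library. The vectors, sizes and function names are made up for illustration; a real project would more likely reach for a dedicated sparse-matrix library.

```python
# Hypothetical data: two mostly-zero vectors, stored densely and sparsely.

def dense_dot(a, b):
    """Dot product that visits every position, zeros included."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def to_sparse(dense):
    """Keep only the non-zero entries as {index: value}."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def sparse_dot(a, b):
    """Dot product over {index: value} dicts; work scales with the non-zeros."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(value * b[i] for i, value in a.items() if i in b)


n = 100_000
dense_a = [0.0] * n
dense_b = [0.0] * n
dense_a[10], dense_a[50_000] = 2.0, 3.0
dense_b[10], dense_b[99_999] = 4.0, 5.0

sparse_a = to_sparse(dense_a)
sparse_b = to_sparse(dense_b)

dense_result = dense_dot(dense_a, dense_b)      # touches all 100,000 positions
sparse_result = sparse_dot(sparse_a, sparse_b)  # touches 2 stored entries
print(dense_result, sparse_result)
```

Both calls give the same answer; the difference is that the sparse version's cost depends on how many values are non-zero rather than on the nominal length of the vector.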

8
1
Anonymous Coward

Re: Execution speed...

> Python makes those things easy, that's why it is a game changer for science.

When talking about science, it is best to avoid hyperbole and exaggeration. Is Python a sometimes convenient tool? Yes. Is it helpful to have in some situations? Absolutely. Is it a "game changer"? Hell, no.

New algorithms for data processing and representation, or new models to analyse that data, or new theories to explain it, or new experimental techniques to measure it can be a critical new development, or a "game changer" if you will. A computer language which allows you to quickly slap together a bit of code neither your colleagues nor yourself will understand in two months' time? Not so much.

4
3
Silver badge

Re: Execution speed...

> Absolutely. Is it a "game changer"? Hell, no.

You should probably talk to more scientists. Python has become popular among a huge range of scientists with no formal computing qualifications who are required to process large amounts of data. I've met several who would never have got their work done without Python. So, yes, for some it really is a game changer.

10
1
Boffin

Re: Execution speed...

> is it a "game changer"?

For me absolutely, for at least a dozen reasons, including resource management, the ability to use it interactively and the fact that you can easily interface to C (or other low-level language) libraries. Although it's probably true that there's nothing you can do in Python that couldn't be done in C, and the C would ultimately run faster, development in Python is orders of magnitude faster, which in a scientific, especially research, context is far more important. Consider, for example, some real-world data set which could be represented by nested dictionaries in which keys can be either real numbers or strings. This is trivial in Python and can be taught to science students without a programming background; doing it in C would probably be a 2nd year undergraduate programming exercise.

Yes, there are potential drawbacks to Python, but in my view they are mainly social (untrained people tend to try to run before they can walk), not technical.
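As a rough illustration of the nested-dictionary case, here is a small sketch. The sample names, temperatures and field names are invented; only the shape of the data (nested dicts keyed by either floats or strings) comes from the description above.

```python
# Hypothetical measurements: sample name -> condition -> field -> value.
# Keys at the second level are a mix of real numbers and strings.
data = {
    "sample_A": {
        273.15: {"resistance_ohm": 12.4, "notes": "ice bath"},
        293.15: {"resistance_ohm": 13.1},
    },
    "sample_B": {
        "room": {"resistance_ohm": 15.0},
        77.0: {"resistance_ohm": 9.8, "notes": "liquid nitrogen"},
    },
}

# Adding a new entry needs no declarations or memory management.
data["sample_B"][4.2] = {"resistance_ohm": 0.0, "notes": "liquid helium"}


def walk(node, path=()):
    """Yield (path, value) for every leaf in the nested dictionaries."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    else:
        yield path, node


leaves = dict(walk(data))
for path, value in leaves.items():
    print(path, "->", value)
```

One caveat worth teaching alongside it: float keys are matched by exact equality, so a key computed as 0.1 + 0.2 will not find an entry stored under 0.3.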

9
0

Re: Execution speed...

Absolutely. I work as a scientific developer at a university and we've begun offering Python bootcamps because so many researchers want to learn it.

3
0

Re: Execution speed...

You could make that argument for Excel/Access/VBA. But I'd rather you didn't.

At some point it all ends up in front of an experienced programmer as a pile of novice code, a huge problem, a short deadline and requirements of "I can't quite get it working, will you take a look?"

It's got its merits but, like everything else in this industry for the past umpteen years, all the breathless hyperbole is a bit of a turn off.

2
1

Re: Execution speed...

@ BeakUpBottom (At some point ...)

That is an example of the social problem I mentioned. When I teach short courses to non-programmers (for very specific tasks), the first thing I say is along the lines of "you will be learning to use a programming language, but this is NOT going to turn you into programmers" (and I repeat it several times thereafter!)

1
0
Silver badge

Re: Execution speed...

> You could make that argument for Excel/Access/VBA. But I'd rather you didn't.

I know no one who prefers VB(A) over Python and lots of people who've moved from VB(A) to Python and have embraced it wholeheartedly, especially as some of us work hard to make it possible to work with MS Office files without having to start Word or Excel.

While in Python you can't simply record a macro to get something done, it's a good example of nearly literate programming. Most new users are keen to write good code and respond well to suggested improvements, and I've almost never come across unreadable newbie code. I know some people hate the whitespace but it makes a real difference in these environments.

VBA on the other hand has access to some fantastic APIs but as a programming language is akin to self-harm.

2
0
tfb
Boffin

Re: Execution speed...

I used to think this was true as well, but it's not. If you are dealing with large quantities of numerical data (and 5GB is not a large quantity in this sense: our jobs create terabytes a day) then having something which implements various numerical array-bashing operations efficiently does actually matter. Hence NumPy.

1
0
Silver badge

Re: Execution speed...

> Hence NumPy.

… and Numba and PyPy, etc. Python has always followed the principle of avoiding premature optimisation, and the libraries allow us to keep following it.

0
0

Re: Execution speed...

There are millions of EUCs in excel/VBA that never need to get in front of an "experienced programmer".

Python is filling a similar niche for people with more specialised number crunching requirements. In terms of hyperbole it's solving a smaller set of problems than the spreadsheet but it's solving them exceptionally well...

This is actually a super-happy story in that we've actually managed to grow some combination of language and tooling that people want to use!

0
0
Silver badge

Re: Execution speed...

@AC:

" When talking about science, it is best to avoid hyperbole and exaggeration. Is python a sometimes convenient tool? Yes. Is it helpful to have iin some situations? Absolutely. Is it a "game changer"? Hell, no. "

That's every bit as strong a statement as the one you're seeking to refute.

In the early days of computer programming, most people were just scheduling batch jobs (hence "programming") using a scripting language.

The problem is, most shell scripting languages are rubbish. Most attempts at more powerful shell scripting languages (e.g. Tcl) were contorted, byzantine affairs. JavaScript was clumsy to start off with, and when people tried to put it into the shell, it just felt weird.

What is often overlooked is that Python is a shell scripting language, and it manages to maintain a pretty high level of flexibility and power while still being more learner-friendly than most languages.

When people complain about its lack of speed, they're kind of missing the point, because in applications like data science, all the heavy lifting is done by libraries, which are generally compiled C code.
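A small stand-in for that point, using only the standard library: in CPython the built-in sum() runs its loop in compiled C, while the hand-written version runs every iteration in the interpreter. This is a sketch of the general effect, not a benchmark of Pandas or NumPy, and the timings will vary by machine and interpreter.

```python
import timeit

values = list(range(100_000))


def python_loop_sum(numbers):
    """Add the numbers up one at a time in interpreted Python."""
    total = 0
    for n in numbers:
        total += n
    return total


# Same answer either way; only where the loop runs differs.
loop_result = python_loop_sum(values)
builtin_result = sum(values)

loop_time = timeit.timeit(lambda: python_loop_sum(values), number=20)
builtin_time = timeit.timeit(lambda: sum(values), number=20)
print(f"python loop: {loop_time:.4f}s  built-in sum: {builtin_time:.4f}s")
```

The same division of labour is what the data libraries rely on: the Python code says what to compute and the compiled code does the iterating.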

Python with Pandas is a bit like a massively updated version of calling grep from a bash script.

It has changed the game.

0
0
Silver badge

Re: Execution speed...

> What is often overlooked is that Python is a shell scripting language,

Python is a computer language. The most common implementation can be used as a 'shell scripting language', or as an application programming language, or as a statement evaluation tool. Other implementations can be used as an embedded language or can compile to various VMs and/or can use JIT compilation.

0
0

approachable for novice programmers?

So how exactly do you install pandas? Every time I want to try out Python, I do not know which version to use; I just hope that the random pip command du jour will work and nothing will break. Should I use the Python version which came with the Mac? Or the one I installed with Homebrew? Where do the packages reside?
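One way to at least see which interpreter is running and where it would put packages is to ask it directly. A minimal sketch using only the standard library; the variable names are just for illustration, and the virtual-environment check assumes an environment created with the standard venv module.

```python
import sys
import sysconfig

# Path of the interpreter actually executing this script
# (system Python, Homebrew Python, a virtual environment, ...).
interpreter = sys.executable

# Its version as "major.minor.micro".
version = ".".join(str(part) for part in sys.version_info[:3])

# Directory where pure-Python packages are installed for this interpreter.
site_packages = sysconfig.get_paths()["purelib"]

# In a venv-style environment, sys.prefix differs from sys.base_prefix.
in_virtualenv = sys.prefix != sys.base_prefix

print("interpreter:   ", interpreter)
print("version:       ", version)
print("site-packages: ", site_packages)
print("in a venv:     ", in_virtualenv)
```

Running the same script with each candidate interpreter shows which one a given command line is really using, and where its packages live.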

0
0
Silver badge
Happy

Re: approachable for novice programmers?

"Where do the packages reside?"

China -Where the Panda repositories are.

10
0
Silver badge

Re: approachable for novice programmers?

Python's packaging remains a problem. However, in general you should avoid installing user libraries for a system language.

Personally, I create a separate virtual environment for every Python project and install the required libraries only there. However, when it comes to Pandas you can also install Anaconda (from the maintainers of Pandas) which comes with its own package manager for a set of well-maintained and pre-compiled libraries.
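For anyone who hasn't seen it, a per-project virtual environment can be created with the standard library's venv module. This is a minimal sketch: the temporary directory and the ".venv" name are placeholders, and pip is left out here only to keep the example quick (a real project would normally pass with_pip=True or run "python -m venv" from the shell).

```python
import tempfile
import venv
from pathlib import Path

# Placeholder project directory; in practice this is your project folder.
project_dir = Path(tempfile.mkdtemp(prefix="demo_project_"))
env_dir = project_dir / ".venv"

# Create an isolated environment. with_pip=False skips bootstrapping pip,
# which keeps this sketch fast; use with_pip=True for real work.
venv.create(env_dir, with_pip=False)

# Every venv records which base interpreter it was made from.
config_file = env_dir / "pyvenv.cfg"
print(config_file.read_text())
```

Libraries installed with that environment's own interpreter land inside the .venv directory, so the system Python is left untouched.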

4
0
Anonymous Coward

Re: approachable for novice programmers?

"So how exactly do you install pandas?"

For engineers such as myself, this is one of the biggest headaches with Python. Compared to more engineering-focused ecosystems, package and dependency management is a pain in the arse.

However there's also a pretty simple answer that satisfies most use cases.

Step 1: Install anaconda

Step 2: Set up a virtual environment per project

Step 3: Distribute that virtualenv as a docker container

Job done.

0
0
Silver badge
Coat

Re: approachable for novice programmers?

"So how exactly do you install pandas?"

Typically one gets a zoo on board about 2 to 3 years ahead of schedule, and arranges government funding to build a specialist panda enclosure, and then one writes up an agreement with the Chinese Government and the panda breeding associations. From what I've been reading of late it costs between US$85,000 and US$1.1 million a month to host the pandas. Most of the money is supposed to go back to the breeding and protection of the species, but I've no proof of that.

<ref TorStar article "Pandas Installed at Toronto Zoo over objections from (Free speech advocacy group) >

1
0
Silver badge

"It's fun (for a programming language)

It's readable

It has lots of libraries

It's approachable for novice programmers"

.

alt.sysadmin.recovery always had a very useful motto: "All hardware sucks. All software sucks. They all suck the same."

As applicable here, all programming languages suck. 'Fun' is an orthogonal concept.

Libraries, people, documentation - that's the package that makes progress possible in any particular language. The language is a circumstance.

6
1
Anonymous Coward

It is the masochistic kind of fun

It is the masochistic kind of fun. BDSM. All that matters is your pain threshold - how painful do you want it to be before you enjoy it.

Python is for people with low pain threshold.

Javascript and C - same but for those who enjoy the quantity - more beatings, more fun. Just light ones every time.

Perl is for the really kinky ones - it does not hurt a lot, but hurts in some really weird places.

Java, C++ - for those of us who need to have a glimpse of the light eternal before they get a kick out of it. A good equivalent would be - BDSM fans of strangulation.

10
1
Silver badge
Mushroom

Python explosion blamed on pandas

Obviously.

Put a panda into a python and it will explode.

viz. -->

7
0
Silver badge
Holmes

Re: Python explosion blamed on pandas

Well, there are pics on Da Intarwebz of a python explosion caused by a crocodile, but none by pandas. So I am disinclined to accept the statement posited above as true.

2
0
Unhappy

Re: Python explosion blamed on pandas

Small pandas are OK. If the python gets greedy, then bad things happen.

0
0
Silver badge
Stop

Why not Fortran?

If it can't be done in Fortran, it ain't worth doing...

3
0
Silver badge
Boffin

Re: Why not Fortran?

Real Programmers can write FORTRAN in any language.

And Real Real Programmers can write assembler in FORTRAN.

4
0

Ansible for systems management, Pandas for small data analysis, PySpark for big data analysis, Notebooks for presentation and encapsulation.

Python: actually pretty nice to use for almost everyone.

0
0

Notebooks in the Azure Cloud

Microsoft offers free Jupyter notebooks in the Azure Cloud at notebooks.azure.com for those interested in investigating Python notebooks. There are also two very basic Python courses from Microsoft on edX suitable for the rank beginner that use the notebooks.

There are free Jupyter notebooks for the Julia language at juliabox.com for those interested in Julia, and there is a Coursera course on Julia that assumes you know other languages. The objective of Julia is to provide the ease of use of Python, R, and Matlab while running as fast as C or Fortran. See juliacomputing.com.

0
1
Thumb Up

OFF-BY-ONE

Off by one.... in a binary system... Oh, well.

0
0


The Register - Independent news and views for the tech community. Part of Situation Publishing