Hacker News
Comprehensive Python Cheatsheet (2018) (gto76.github.io)
335 points by pizzaburek on Feb 4, 2019 | 81 comments


I like the basic script template that the author provides.

However, when I start development of new code I always use a script that starts off with traceback and pdb, something like this:

    #!/usr/bin/env python

    import traceback
    import pdb
    import sys

    def main():
        # some WIP code that maybe raises an exception
        raise Exception("oh no, exception!")
        return 0

    if __name__ == "__main__":
        try:
            ret = main()
        except Exception:
            # print the full backtrace, then drop into the debugger
            traceback.print_exc()
            pdb.post_mortem()
            ret = 1  # without this, ret would be unbound in sys.exit() below
        sys.exit(ret)
This means that whenever an uncaught exception gets raised, I immediately get told what happened (full backtrace) plus I get dumped into the debugger, from where I can inspect why this happened. I liberally sprinkle assert()s through my code, so this gives me a good edit-run-debug cycle to work with.
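A variation on the same idea, as a rough sketch rather than the template above: instead of wrapping main() in try/except, install a sys.excepthook that prints the traceback and then starts the post-mortem debugger. The names debug_excepthook and install_debug_hook are made up for illustration, and the tty check is my own assumption about when a debugger prompt is wanted.

```python
import pdb
import sys
import traceback


def debug_excepthook(exc_type, exc_value, exc_tb):
    """Print the traceback, then open pdb at the frame that raised."""
    traceback.print_exception(exc_type, exc_value, exc_tb)
    # Only start the debugger when a human is attached to the terminal
    # (assumption: nobody wants a pdb prompt in a cron job or a pipe).
    if sys.stdin.isatty() and sys.stderr.isatty():
        pdb.post_mortem(exc_tb)


def install_debug_hook():
    """Route every uncaught exception through debug_excepthook."""
    sys.excepthook = debug_excepthook
```

Calling install_debug_hook() once at the top of a script gives the same edit-run-debug loop without touching the `if __name__ == "__main__"` block.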


This is pretty smart. When I'm making one-off scripts I use the interactive flag

python -i script.py

I've programmed Python for years and never knew this. I'm not sure how well this handles errors. Tab completion works; the only thing missing is a quick way to see signatures, so you have to call help(<func>) yourself. I'm enjoying the standard lib interpreter.


I use IPython for similar tasks. It has a wrapper for the 'help' function (a '?' after the function name), tab completion and syntax highlighting. It also has a special %hist command that makes it easy to copy and paste code.


I like IPython unless working with large arrays, because of weird memory issues I experienced (maybe they're fixed?). I miss the %run, !<cd, ls, pip>, %timeit magic and the IPython.terminal.debugger.set_trace().

If you're on linux it's not _that_ bad without it since the shell sort of acts like the notebook (with an equally horrible markup language). The help() and tab-complete and _ __ ___ for last 3 values works, time python <file>.py can benchmark, -i or pdb.set_trace does the debugging. It doesn't keep a persistent history though so the workflow is sort of different.

So I use both, particularly for plots and images knowing IPython is very helpful for working with the notebooks.


Is it just me or is dict#setdefault a terribly named function? I just learned of it here, and I was excited that it would set the default value for the dictionary, when instead it performs something like:

  (self, key, default) => self[key] = (key in self ? self[key] : default)
I'd prefer something like "dict#createifnotexists", but better.



I'm aware of defaultdict, but when I saw that method on dict I got excited because I thought it meant I didn't need defaultdict anymore.
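For anyone comparing the two, here is a small sketch of the difference; the variable names are just for illustration. setdefault works on one key per call and returns the stored value, while defaultdict supplies a default for every missing key.

```python
from collections import defaultdict

# dict.setdefault: insert the default only if the key is missing,
# then return whatever is stored under the key.
groups = {}
groups.setdefault("a", []).append(1)
groups.setdefault("a", []).append(2)   # "a" exists, the new [] is discarded
groups.setdefault("b", []).append(3)

# defaultdict: the factory is called for any missing key on lookup.
auto_groups = defaultdict(list)
auto_groups["a"].append(1)
auto_groups["a"].append(2)
auto_groups["b"].append(3)

print(groups)             # {'a': [1, 2], 'b': [3]}
print(dict(auto_groups))  # {'a': [1, 2], 'b': [3]}
```

A plain dict still raises KeyError on `groups["missing"]` after all those setdefault calls, which is the part the name does not make obvious.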


Or get_or_create, to follow Django's example.


There are several items that are dubious at best. Why would you use

    no_duplicates    = list(dict.fromkeys(<list>))
instead of

    no_duplicates = list(set(<list>))

?


   >>> list(dict.fromkeys([3, 2, 1]))
   [3, 2, 1]
   >>> list(set([3, 2, 1]))
   [1, 2, 3]


Dictionaries should not be assumed to keep their orderings anyway.

That's why there's OrderedDict.

Unless the language explicitly says that dicts hold their order, even if they do, it's an implementation detail, and it should not be relied upon.


As of 3.7 (3.6 for cpython), dictionaries have a deterministic order.


What's the best way to write code that relies on this deterministic order and prevent it from running incorrectly with older python versions?


Import collections.OrderedDict as you would have done beforehand. There are still some differences between them.

(Dicts in 3.7 and ordered dicts)

https://stackoverflow.com/questions/50872498/will-ordereddic...
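A minimal sketch of that advice, with hypothetical names (unique_in_order, DICT_ORDER_GUARANTEED): use OrderedDict where the order matters, and optionally check the interpreter version if the code depends on plain dicts being ordered. The last lines show one of the remaining differences, namely that OrderedDict equality is order-sensitive.

```python
import sys
from collections import OrderedDict


def unique_in_order(items):
    """De-duplicate while keeping first-seen order, on any Python with OrderedDict."""
    return list(OrderedDict.fromkeys(items))


# True only where the language itself guarantees insertion order for plain dicts.
DICT_ORDER_GUARANTEED = sys.version_info >= (3, 7)

# Equality compares order for OrderedDict, but not for dict.
a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])
print(a == b)              # False
print(dict(a) == dict(b))  # True
```

Code that relies on plain-dict ordering could raise early when DICT_ORDER_GUARANTEED is false instead of silently producing a different order.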


If "the language explicitly says" something, then it's part of the specification and not an implementation detail.


Sure, hence the "unless" I've used.


I think this behavior only works starting from Python 3.6, because before that the dictionary returned by `dict.fromkeys()` would not keep the ordering [1]

[1] https://youtu.be/p33CVV29OG8?t=489


I love that talk. Raymond Hettinger is a gem. What I took from that is the implementation does preserve order but it isn't guaranteed yet and in theory it could change.


> I love that talk. Raymond Hettinger is a gem.

This talk is definitely worth a watch yes :)

> What I took from that is the implementation does preserve order but it isn't guaranteed yet and in theory it could change.

This was the case for python 3.6, but starting from Python 3.7 ordering is guaranteed [1]:

  Changed in version 3.7: Dictionary order is guaranteed to be insertion order.
[1] https://docs.python.org/3.7/library/stdtypes.html#dict-views


> Raymond Hettinger is a gem.

His talks are good, except that every time he asks a stupid rhetorical question of the audience I’m overwhelmed with the urge to wang¹ a tomato at him.

1. https://www.gunnerkrigg.com/?p=330


Neither dict nor set preserve order. This is a misleading snippet.


In Python 3.6+, dictionaries preserve insertion order. This is done by storing the keys, values (and cached hash values) in a separate array, and the hashtable is a succinct array of indexes into this array. This results in more compact dictionaries, which are also a bit faster because of their cache friendliness. Preserving insertion order is a happy side effect of this. This optimization can also be applied to Python sets, but it hasn't been done yet.
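To make the layout concrete, here is a toy sketch of the idea described above, not CPython's actual implementation: a sparse table of indexes plus a dense, append-only list of (hash, key, value) entries. The class name CompactDict, the linear probing and the 2/3 load factor are my own simplifications, and deletion is left out.

```python
class CompactDict:
    """Toy insertion-ordered mapping: sparse index table + dense entries list."""

    def __init__(self, capacity=8):
        self._indices = [None] * capacity  # sparse: slot -> position in _entries
        self._entries = []                 # dense: (hash, key, value) in insertion order

    def _probe(self, key, h):
        """Return the slot that holds key, or the first empty slot on its probe path."""
        size = len(self._indices)
        slot = h % size
        while True:
            pos = self._indices[slot]
            if pos is None:
                return slot
            entry_hash, entry_key, _ = self._entries[pos]
            if entry_hash == h and entry_key == key:
                return slot
            slot = (slot + 1) % size  # linear probing, chosen for simplicity

    def _resize(self):
        """Double the index table; the entries list is left untouched."""
        self._indices = [None] * (len(self._indices) * 2)
        for pos, (h, key, _) in enumerate(self._entries):
            self._indices[self._probe(key, h)] = pos

    def __setitem__(self, key, value):
        h = hash(key)
        slot = self._probe(key, h)
        pos = self._indices[slot]
        if pos is None:
            # New key: append to the dense list and remember its position.
            self._indices[slot] = len(self._entries)
            self._entries.append((h, key, value))
            if len(self._entries) * 3 >= len(self._indices) * 2:
                self._resize()
        else:
            # Existing key: overwrite in place, so its position is unchanged.
            self._entries[pos] = (h, key, value)

    def __getitem__(self, key):
        pos = self._indices[self._probe(key, hash(key))]
        if pos is None:
            raise KeyError(key)
        return self._entries[pos][2]

    def __iter__(self):
        # Iterating the dense list yields keys in insertion order "for free".
        return (key for _, key, _ in self._entries)


d = CompactDict()
for k in (3, 2, 1):
    d[k] = str(k)
d[3] = "three"     # updating does not move the key
print(list(d))     # [3, 2, 1]
```

The ordering falls out of the append-only entries list; the hash table itself only stores small integers.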


More importantly, in Python 3.7 this exception has become the rule, i.e. it was introduced in the language specification


I am personally not a fan of details like this creeping into the language spec. CPython is not the only Python. This kind of spec creates unnecessary challenges for Micropython, for example. And of course there is Cython and others. It makes no sense to me that this should be in the language spec — you are essentially specifying a built in minefield of implementation bugs.


CPython took this implementation detail from PyPy (so it was already going to be the case in the two most used Python implementations).

The reason being that the ordered dict ends up being faster than non-ordered, and people will rely on this implementation detail, so they added it to the spec to make that okay.


Well I'll be dipped.


Very cool!


I'm not sure what you're trying to imply. If it's regarding the order, you're wrong. The keys returned by either `keys()` or by `__iter__` are returned in arbitrary order [1].

[1] https://docs.python.org/2/tutorial/datastructures.html#dicti...


The whole cheatsheet is clearly in Python 3, where the same doc page that you linked states:

   Performing list(d) on a dictionary returns a list of all the keys used in the dictionary, in 
   insertion order ...


only specified as such since Python 3.7, which hadn't been released when this cheat sheet was written, at least if the date at the top is accurate.

EDIT: clicking through to GitHub, the date is clearly not up-to-date, so this might have been added later.


They both choke on lists that contain lists.
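To illustrate: both approaches hash the elements, so an unhashable element such as a list raises TypeError. A rough fallback, assuming the elements at least support ==, is a quadratic scan; dedupe_unhashable is a made-up name.

```python
def dedupe_unhashable(items):
    """Order-preserving de-duplication using only ==; O(n^2), fine for short lists."""
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


nested = [[1, 2], [3], [1, 2]]

for attempt in (set, dict.fromkeys):
    try:
        attempt(nested)
    except TypeError as exc:
        print(attempt.__name__, "failed:", exc)

print(dedupe_unhashable(nested))  # [[1, 2], [3]]
```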


list(dict.fromkeys(<list>)) preserves ordering, whilst list(set(<list>)) doesn't.


Can we please change the current clickbait title (Best Python Cheatsheet Ever!) to the original one (Comprehensive Python Cheatsheet)?


Agreed, although I'll note here that this was posted to /r/python under the clickbait version of the title, so the HN submitter was likely just passing it on from there.


    if __name__ == '__main__':
        main()
is better as

    if __name__ == '__main__':
        sys.exit( main() )


For those wondering why, like I did, here's why: https://stackoverflow.com/questions/5280203/what-does-this-m...
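A short sketch of what this buys you, with a made-up main(): whatever main() returns becomes the process exit status, because sys.exit() raises SystemExit carrying that value. The try/except below is only there to make the effect visible; in a real script you would keep the two-line form quoted above.

```python
import sys


def main(argv):
    """Hypothetical entry point: return 0 on success, nonzero on failure."""
    if not argv:
        print("usage: script.py <name>", file=sys.stderr)
        return 2
    print("hello,", argv[0])
    return 0


try:
    sys.exit(main([]))          # no arguments, so main() returns 2
except SystemExit as exc:
    print("exit status:", exc.code)  # exit status: 2
```

With a bare `main()` call the return value is thrown away and the shell always sees status 0 unless an exception escapes.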


Very good page. But the best python cheat sheet is help(the_thing). python has the second best ever help and discoverability (after Matlab of course!)


But to do `help(thing)` you need to know what the _thing_ is.

With Python, many times I vaguely know something but dunno what it is exactly. This cheatsheet solves that!


Is there some way to list all modules available for importing? Sometimes you don't know which modules are available to ask for help on.


>>> help('modules')
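If you want the names as data rather than printed text, something like this sketch should work with the standard library only; importable_module_names is a made-up helper, and it only lists top-level modules and packages.

```python
import pkgutil
import sys


def importable_module_names():
    """Top-level module names: compiled-in builtins plus everything found on sys.path."""
    names = set(sys.builtin_module_names)
    names.update(info.name for info in pkgutil.iter_modules())
    return sorted(names)


names = importable_module_names()
print(len(names), "top-level modules found")
```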


Related post on my blog:

Get names and types of a Python module's attributes:

https://jugad2.blogspot.com/2016/10/get-names-and-types-of-p...

and the same recipe on ActiveState Code (from where you can download the Python code for the recipe):

https://code.activestate.com/recipes/580705-get-names-and-ty...


You can call help on anything. Here I am asking for help on something I have no clue about:

  >>> from server.db import db
  >>> help(db)
  Help on Client in module google.cloud.firestore_v1beta1.client object:

  class Client(google.cloud.client.ClientWithProject)
   |  Client(project=None, credentials=None, database='(default)')
   |
   |  Client for interacting with Google Cloud Firestore API.
   |
   |  .. note::
   |
   |      Since the Cloud Firestore API requires the gRPC transport, no
   |      ``_http`` argument is accepted by this class.
   ...


I think legends 2k means you may not know the name of a function you need to use in something you are trying to implement.

Example: Suppose you didn't know that re was what you needed for regular expression, or suppose you forgot how iterators work etc.


Sure, that's why it's #2 on "2 Hard things in CS". Should probably be #1.


Yes! And implementing this help text for your library is free. It just assembles the docstrings found in your code. And then your favourite text editor can use this for autocomplete and such as well.

Treating docstrings as an actual reflectable part of a class/function and not just "comments to be ignored by a parser" is brilliant.
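A tiny sketch of what "reflectable" means here, using a made-up function: the docstring is stored on the object as `__doc__`, and the inspect module can read both it and the signature, which is the kind of information help() and editors display.

```python
import inspect


def area(width, height):
    """Return the area of a width x height rectangle."""
    return width * height


print(area.__doc__)             # Return the area of a width x height rectangle.
print(inspect.signature(area))  # (width, height)
print(inspect.getdoc(area))     # same text, with indentation cleaned up
```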


I would put it after R help function and, even better, F1 key to get the full documentation of any thing in Rstudio.


I'm amazed you think the R documentation is anything but terrible.

Compare the R vector page: https://stat.ethz.ch/R-manual/R-devel/library/base/html/vect...

To the equivalent python documentation: https://docs.python.org/3/tutorial/datastructures.html


The documentation could clearly be prettier, but its strength lies in its standardization: I can get method arguments and examples for all functions of all packages in the same format and with a single keystroke.


I could not help but comment to add Elixir to the list of languages with amazing discoverability. For things in the standard library, you essentially get a condensed version of Python web documentation, with examples!


Seems to me that perhaps this should be a Show HN since it's submitted by the creator.

Also, for me the main link won't actually display and I can't be bothered to track down why, but the Github source (https://github.com/gto76/python-cheatsheet) displays just fine.


Nice, but why does this need JS to be rendered?


Because it's rendered directly from README.md. That way it's easier for me, because I don't have to render it every time I make a change (I make a lot of little edits all the time), and project's Github page (https://github.com/gto76/python-cheatsheet) always has the same content as webpage.


It's quite annoying to have to enable JS for 3 different domains just to see something other than a blank page. It also prevents non-JS browsers and retrievers from seeing content.

If you were to use e.g. org-mode for the source document, you could easily export to HTML automatically when the file is saved.

You could also easily use a git hook to run e.g. pandoc to convert md to HTML automatically.

There are many ways of automatically exporting HTML when you save the md source file. Please use one.
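As a rough sketch of the pandoc route, assuming pandoc is installed and on PATH: a small Python script that a git hook could call. The function names and the README.md / index.html file names are placeholders, not anything from the project.

```python
import subprocess


def pandoc_command(source, target):
    """Build the pandoc invocation that turns GitHub-flavoured Markdown into one HTML page."""
    return [
        "pandoc",
        "--from=gfm",
        "--to=html",
        "--standalone",
        f"--output={target}",
        str(source),
    ]


def export_html(source="README.md", target="index.html"):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, target), check=True)
```

Calling export_html() from a pre-commit hook would regenerate the page on every commit, so readers without JS get the static version.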


Will look into it, thanks for info.


Thanks for being open to the idea (unlike the downvoters). HN is so tiresome.


I would also encourage the author and others to export static documents to HTML. As a user, I really appreciate it.


Why not use static site generator and re-run on content update?


Thanks.


I like the clean look and the overall way it is presented. The examples using angle brackets (e.g. <list>.append(<el>)) are highly legible and the one-column format is faster to scan.


I came here to say exactly the opposite. The angle brackets mean that every single example is a syntax error.

If you are going to write a cheatsheet with samples of Python code, why not write it in valid Python?

This is clearly aimed at beginners as the samples are of very basic things, yet this is going to be confusing to them as they will think that `<list>` is syntax.

Have a look at the Python docs for much better ways of doing this.
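To make the point checkable, here is a small sketch that asks the parser directly; is_valid_python is a made-up helper and the two snippets are only examples.

```python
import ast


def is_valid_python(source):
    """Return True if source parses as Python, False on a SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_valid_python("<list>.append(<el>)"))  # False
print(is_valid_python("a_list.append(el)"))    # True
```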


I think the author of this document is trying to differentiate between types (broad notions of course -- there's no type called 'num' in Python).

The Python docs (example here[1]) on the other hand limit themselves to {}, [] and () to differentiate between dicts, lists and tuples, but don't really differentiate between these and the more generalized notions of collections/iterators/element, etc.

As someone who codes in statically typed languages and has an intuitive sense of upcasting / downcasting / contravariance / covariance, this level of precision makes sense to me and enhances my appreciation for Python types.

That said, I've also been writing Python professionally since 2005 so I'm comfortable with not worrying too much about types in Python -- with dynamic typing things just work as long as they have the right shape and behavior.

[1] https://docs.python.org/3/library/threading.html#threading.T...


But he could achieve exactly the same thing by using descriptive variable names without the angle brackets. Admittedly he can't use `list` as it's a built-in but `a_list` or `list_` would be much better than `<list>`.


My personal opinion is that `list_` is among the worst possible variable names (together with `bar` and `foo`). It's non-descriptive, ugly, hard to recognize, looks scary to beginners and maybe worst of all it's easy to forget the underscore when typing.


That's exactly what he's used, but with angle brackets around it. What do you think he should have done instead?


Good point. For list, one could still use the name "lst" and get the point across.


Now, this I understand. I have to sit down and translate it to Rust :-)


Something magical happens when you have a one page reference for a language or tool. I just end up learning it much faster.

For me this is the best quick reference for Python (2.7 unfortunately): http://rgruet.free.fr/PQR27/PQR2.7.html


So to get a list of unique values you suggest I convert to a dictionary and then convert back to a list? Is this the Python way? I dunno it seems like acrobatics

  no_duplicates = list(dict.fromkeys(<list>))


Easier:

  no_duplicates = set(<list>)


Except that won't return a list. You need list(set(<list>)). Also using set means you lose list ordering, which isn't the case when using dict (assuming python 3.6 or higher).

That being said, and not that it should ever make a difference, list(set(<list>)) is 2-3 times faster than list(dict.fromkeys(<list>))
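If you want to measure this yourself, here is a minimal timeit sketch. The data shape (100,000 ints with 1,000 distinct values) is an arbitrary choice of mine, and the ratio will vary with the input and the Python version.

```python
import timeit

data = [i % 1000 for i in range(100_000)]


def via_set(items):
    """Fast, but the resulting order is arbitrary."""
    return list(set(items))


def via_dict(items):
    """Keeps first-seen order (guaranteed from Python 3.7)."""
    return list(dict.fromkeys(items))


set_time = timeit.timeit(lambda: via_set(data), number=20)
dict_time = timeit.timeit(lambda: via_dict(data), number=20)
print(f"set: {set_time:.4f}s  dict.fromkeys: {dict_time:.4f}s")
```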


This is quite useful as I don't always remember syntax.

Are there any available for Ruby or Linux?


For Linux, from the same author: https://gto76.github.io/linux-cheatsheet/


Empty set literal: {*()}


Considering it has the same number of characters as the idiomatic set(), its only value is in curiosity as a special case of collection unpacking.
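For the curious, a quick sketch of why it works (Python 3.5+): unpacking an empty tuple inside braces gives a set display with no elements, whereas bare braces are always a dict.

```python
empty_set = {*()}   # set display whose only item is an unpacked empty tuple
empty_dict = {}     # bare braces mean an empty dict, not a set

print(type(empty_set).__name__, type(empty_dict).__name__)  # set dict
print(empty_set == set())                                   # True
```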


   MIND = BLOWN


I like your linux cheat sheet too. Thanks for sharing!


Is there a cheatsheet for Cython to go with this?


In the section "Inline -> Comprehension", isn't this:

    <iter> = (i+5 for i in range(10))         # (5, 6, ..., 14)
really a generator expression?


Yes, although there is a saying that "every generator is an iterator, but not every iterator is a generator". But then again generator has a bunch of methods (close, gi_frame, gi_yieldfrom, throw, gi_code, gi_running, send) that iterators don't have... I really don't know if <iter> is correct enough here, or should I use <genr> (that I don't use anywhere else and could be confusing).
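A small sketch of that saying, using the standard ABCs: the generator expression passes both checks, while a plain list iterator is an iterator but not a generator and has no send().

```python
import types
from collections.abc import Generator, Iterator

gen = (i + 5 for i in range(10))   # generator expression
list_iter = iter([5, 6, 7])        # ordinary iterator over a list

print(isinstance(gen, types.GeneratorType))  # True
print(isinstance(gen, Iterator))             # True: every generator is an iterator
print(isinstance(list_iter, Iterator))       # True
print(isinstance(list_iter, Generator))      # False: not every iterator is a generator
print(hasattr(list_iter, "send"))            # False
```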


I would've written:

<iter> = ( f(i) for i in <gen>)

where f() is some function of i.



