
Understanding slicing

4294

I need a good explanation (references are a plus) of Python slicing.

0

    6072

    The syntax is:

    a[start:stop]  # items start through stop-1
    a[start:]      # items start through the rest of the array
    a[:stop]       # items from the beginning through stop-1
    a[:]           # a copy of the whole array
    

    There is also the step value, which can be used with any of the above:

    a[start:stop:step] # start through not past stop, by step
    

    The key point to remember is that the :stop value represents the first value that is not in the selected slice. So, the difference between stop and start is the number of elements selected (if step is 1, the default).
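
    As a quick check of the half-open rule described above, here is a minimal sketch. The list `a` and the bounds `start` and `stop` are illustrative values, not part of the original answer.

```python
a = list(range(10))            # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
start, stop = 2, 7

chunk = a[start:stop]          # items at indices 2, 3, 4, 5, 6
every_other = a[start:stop:2]  # same window, taking every second item
whole_copy = a[:]              # a new list with the same items

# With the default step of 1, the slice holds stop - start items,
# and a[stop] is the first item that is NOT included.
assert len(chunk) == stop - start
assert a[stop] not in chunk
```

    The copy made by `a[:]` is shallow: it is a new list object, but the items themselves are shared with the original.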

    The other feature is that start or stop may be a negative number, which means it counts from the end of the array instead of the beginning. So:

    a[-1]    # last item in the array
    a[-2:]   # last two items in the array
    a[:-2]   # everything except the last two items
    

    Similarly, step may be a negative number:

    a[::-1]    # all items in the array, reversed
    a[1::-1]   # the first two items, reversed
    a[:-3:-1]  # the last two items, reversed
    a[-3::-1]  # everything except the last two items, reversed
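
    The negative-index and negative-step forms listed above can be verified on a small list. This is only a sketch; the five-letter list is an arbitrary example.

```python
a = ["a", "b", "c", "d", "e"]

# Negative start/stop count from the end.
assert a[-1] == "e"
assert a[-2:] == ["d", "e"]
assert a[:-2] == ["a", "b", "c"]

# A negative step walks backwards, and stop is still excluded.
assert a[::-1] == ["e", "d", "c", "b", "a"]
assert a[1::-1] == ["b", "a"]        # starts at index 1, runs to the front
assert a[:-3:-1] == ["e", "d"]       # starts at the end, stops before index -3
assert a[-3::-1] == ["c", "b", "a"]  # starts at index -3, runs to the front
```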
    

    Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
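
    A short sketch of that forgiving behaviour, contrasted with plain indexing, which does raise. The helper `strict_slice` is hypothetical (it is not a built-in) and shows one possible way to get the error back when you want it; it assumes non-negative bounds.

```python
a = [42]

# Out-of-range slices are clamped silently.
assert a[:-2] == []
assert a[5:10] == []

# Out-of-range indexing is not.
index_failed = False
try:
    a[5]
except IndexError:
    index_failed = True
assert index_failed


def strict_slice(seq, start, stop):
    """Hypothetical helper: like seq[start:stop], but raise IndexError
    when the (non-negative) bounds do not fit inside seq."""
    if not (0 <= start <= stop <= len(seq)):
        raise IndexError(f"slice [{start}:{stop}] out of range for length {len(seq)}")
    return seq[start:stop]
```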

    Relationship with the slice object

    A slice object can represent a slicing operation, i.e.:

    a[start:stop:step]
    

    is equivalent to:

    a[slice(start, stop, step)]
    

    Slice objects also behave slightly differently depending on the number of arguments, similarly to range(), i.e. both slice(stop) and slice(start, stop[, step]) are supported.
    To skip specifying a given argument, one might use None, so that e.g. a[start:] is equivalent to a[slice(start, None)] or a[::-1] is equivalent to a[slice(None, None, -1)].

    While the :-based notation is very helpful for simple slicing, the explicit use of slice() objects simplifies the programmatic generation of slices.
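
    The equivalences above can be checked directly. The sketch below also shows two things that only the object form allows: giving a slice a name, and asking it via `slice.indices()` which concrete indices it resolves to for a given length. The names `HEADER`, `BODY` and `record` are made up for illustration.

```python
a = list(range(10))

# Colon notation and slice objects select the same items.
assert a[2:8:2] == a[slice(2, 8, 2)]
assert a[:3] == a[slice(3)]                 # slice(stop), like range(stop)
assert a[4:] == a[slice(4, None)]
assert a[::-1] == a[slice(None, None, -1)]

# Slice objects can be stored and reused like any other value.
HEADER = slice(0, 2)
BODY = slice(2, None)
record = ["id", "name", 1, 2, 3]
header, body = record[HEADER], record[BODY]

# indices(length) returns a (start, stop, step) tuple suitable for range().
resolved = slice(None, None, -1).indices(len(a))
assert [a[i] for i in range(*resolved)] == a[::-1]
```

    For a negative step, the resolved stop comes back as -1, meaning "one before index 0" in `range()` terms, which is not the same as writing -1 inside the brackets.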

    10

    • 180

      Slicing builtin types returns a copy but that’s not universal. Notably, slicing NumPy arrays returns a view that shares memory with the original.

      Sep 23, 2013 at 0:13


    • 124

      This is a beautiful answer with the votes to prove it, but it misses one thing: you can substitute None for any of the empty spaces. For example, a[None:None] makes a whole copy. This is useful when the end of the range comes from a variable and the last item must be included.

      Jan 16, 2019 at 18:49

    • 9

      Note that, contrary to the usual Python slices described above, label-based slicing of pandas DataFrames (e.g. with .loc) includes both the start and the stop when they are present in the index. For further info, see the pandas indexing documentation.

      – vreyespue

      May 29, 2019 at 12:54


    • 26

      What really annoys me is that Python says that when you don't set the start and the end, they default to 0 and the length of the sequence. So, in theory, "abcdef"[::-1] should be transformed to "abcdef"[0:6:-1], but these two expressions do not produce the same output. I feel that something has been missing from the Python documentation since the creation of the language.

      Jun 30, 2019 at 14:00

    • 27

      And I know that "abcdef"[::-1] is transformed to "abcdef"[6:-7:-1], so the best way to explain it would be: let len be the length of the sequence. If step is positive, the defaults for start and end are 0 and len. If step is negative, the defaults for start and end are len and -len - 1.

      Jun 30, 2019 at 14:22


    656

    The Python tutorial talks about it (scroll down a bit until you get to the part about slicing).

    The ASCII art diagram is helpful too for remembering how slices work:

     +---+---+---+---+---+---+
     | P | y | t | h | o | n |
     +---+---+---+---+---+---+
     0   1   2   3   4   5   6
    -6  -5  -4  -3  -2  -1
    

    One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n.
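
    The "indices point between characters" picture can be tested in a few lines. This is a small sketch using the same word as the diagram; the variable names are illustrative.

```python
word = "Python"
n = len(word)

# Cutting at edges 0 and 2 yields the characters between them.
assert word[0:2] == "Py"
# Edge n is the right edge of the last character.
assert word[2:n] == "thon"
# Negative edges count from the right: -2 is the same edge as n - 2.
assert word[-2:] == word[n - 2:] == "on"

# Any single edge splits the string into two pieces that rejoin exactly.
for i in range(-n, n + 1):
    assert word[:i] + word[i:] == word
```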

    4

    • 33

      This suggestion works for a positive stride, but not for a negative stride. From the diagram, I expect a[-4:-6:-1] to be yP, but it is ty. What always works is to think in characters or slots and treat the indexing as a half-open interval (right-open for a positive stride, left-open for a negative stride).

      – aguadopd

      May 27, 2019 at 20:05


    • But there’s no way to collapse to an empty set starting from the end (like x[:0] does when starting from the beginning), so you have to special-case small arrays. :/

      – endolith

      Jul 6, 2019 at 20:07


    • 2

      @aguadopd You are absolutely right. The solution is to have the indices shifted to the right, centered just below the characters, and notice that the stop is always excluded. See another response just below.

      Apr 5, 2021 at 21:32


    • 1

      Addendum to my comment: see my answer with diagrams below: stackoverflow.com/a/56332104/2343869

      – aguadopd

      Apr 15, 2021 at 1:04

    493

    Enumerating the possibilities allowed by the grammar for the sequence x:

    >>> x[:]                # [x[0],   x[1],          ..., x[-1]    ]
    >>> x[low:]             # [x[low], x[low+1],      ..., x[-1]    ]
    >>> x[:high]            # [x[0],   x[1],          ..., x[high-1]]
    >>> x[low:high]         # [x[low], x[low+1],      ..., x[high-1]]
    >>> x[::stride]         # [x[0],   x[stride],     ..., x[-1]    ]
    >>> x[low::stride]      # [x[low], x[low+stride], ..., x[-1]    ]
    >>> x[:high:stride]     # [x[0],   x[stride],     ..., x[high-1]]
    >>> x[low:high:stride]  # [x[low], x[low+stride], ..., x[high-1]]
    

    Of course, if (high-1-low)%stride != 0, then the end point will be a little lower than high-1.
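
    To see where a strided slice actually ends, here is a minimal sketch. The values of `low`, `high` and `stride` are arbitrary, and the formula for `last` is my own restatement of the rule: the last element is the largest low + k*stride that stays below high.

```python
x = list(range(20))            # x[i] == i, so values double as indices
low, high, stride = 2, 12, 4

picked = x[low:high:stride]    # indices 2, 6, 10

# The last selected index is the largest low + k*stride below high.
last = low + ((high - 1 - low) // stride) * stride
assert picked[-1] == x[last]

# Here the slice stops short of high - 1 because the stride does not land on it.
assert last < high - 1
```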

    If stride is negative, the ordering is changed a bit since we’re counting down:

    >>> x[::-stride]        # [x[-1],   x[-1-stride],   ..., x[0]    ]
    >>> x[high::-stride]    # [x[high], x[high-stride], ..., x[0]    ]
    >>> x[:low:-stride]     # [x[-1],   x[-1-stride],   ..., x[low+1]]
    >>> x[high:low:-stride] # [x[high], x[high-stride], ..., x[low+1]]
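
    The four negative-stride forms above can be checked against concrete numbers. This is only a sketch; `low`, `high` and `stride` are arbitrary illustrative values with low < high.

```python
x = list(range(10))            # x[i] == i
low, high, stride = 2, 8, 3

assert x[::-stride] == [9, 6, 3, 0]
assert x[high::-stride] == [8, 5, 2]
assert x[:low:-stride] == [9, 6, 3]       # x[low] itself is excluded
assert x[high:low:-stride] == [8, 5]      # starts AT high, stops before low

# The same indices are produced by range() with the same three arguments.
assert x[high:low:-stride] == [x[i] for i in range(high, low, -stride)]
```

    Note the asymmetry with the positive case: with a negative stride the first bound is included and the second is excluded, so the slice starts at x[high] rather than x[high-1].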
    

    Extended slicing (with commas and ellipses) is mostly used only by special data structures (like NumPy); the basic sequences don't support it.

    >>> class slicee:
    ...     def __getitem__(self, item):
    ...         return repr(item)
    ...
    >>> slicee()[0, 1:2, ::5, ...]
    '(0, slice(1, 2, None), slice(None, None, 5), Ellipsis)'
    

    4

    • Actually there is still something left out, e.g. if I type 'apple'[4:-4:-1] I get 'elp'. Is Python translating the -4 to a 1, maybe?

      – liyuan

      Jan 1, 2018 at 16:39


    • Note that backticks are deprecated in favour of repr.

      – wjandrea

      Jan 27, 2019 at 1:36

    • @liyuan The type implementing __getitem__ is doing that translation; your example is equivalent to 'apple'[slice(4, -4, -1)].

      – chepner

      Sep 10, 2019 at 14:26

    • The first two tables are pure gold.

      – Bananeen

      Dec 20, 2021 at 4:16