A comprehensive guide to permutations for beginners

January 9, 2023 · 38 min · 7928 words · nor

This post is intended to be an introduction to permutations for beginners, as there seem to be no resources other than cryptic (for beginners) editorials that talk about some cycles/composition, and people unfamiliar with permutations are left wondering what those things are.

We'll break the post into three major parts based on how we most commonly look at permutations. Of course, even though we will treat these approaches separately (though not so much for the last two approaches), in many problems, you might need to use multiple approaches at the same time and maybe even something completely new. Usually, you'd use these ideas in conjunction with something like greedy/DP after you realize the underlying setup.

An ordering of the sections in decreasing order of importance according to the current permutations meta would be: cycle decomposition, permutation composition, ordering.

However, the topics are introduced in a way so that things make more sense, because there are some dependencies in definitions.

Table of contents

  1. Definition, basic properties, and notation
    • Definition
    • Two-line notation
    • Permutation invariants
  2. The ordering perspective
    • Fixed points
    • Derangements
    • Inversions
    • Longest increasing subsequences, and Erdos Szekeres theorem
    • Next permutation
    • Random permutations
  3. The cycle decomposition perspective
    • Cycles
    • Cycle notation
    • Finding kk-th next, functional graphs
    • Swaps/transpositions
    • Swaps on cycles
  4. The permutation composition perspective
    • Definition
    • Examples --- cycles and swaps
    • Parity of permutations
    • Parity under composition
    • Inverse of permutations
    • Involutions
    • kk-th power of permutations
    • Order of permutations
    • Square-root of permutations
    • Conjugation and conjugacy classes
  5. Miscellaneous topics (brief discussion)
    • Group theory
    • Counting: Burnside, Polya enumeration, Stirling numbers
    • Coordinate compression
    • Connected component DP
    • Young tableaus
    • Schreier Sims algorithm
    • Permutation tree
  6. Problems

Definition, basic properties, and notation

To start with, here's the definition of a permutation that you might find in many problems on Codeforces:

Definition: A permutation of size nn is an array of size nn where each integer from 11 to nn appears exactly once.

But why do we care about these arrays/sequences? The reason behind this is simple.

Consider any sequence aa of nn integers. We can index them with integers from 11 to nn. So the ii-th integer in the sequence is aia_i. To understand the following, it helps to consider the example where ai=ia_i = i for all ii.

Let's now shuffle this sequence (elements are allowed to stay at their original places). We will try to construct a permutation from this shuffling in two ways:

  1. What is the new index of the element that was initially at position ii? (In the case when ai=ia_i = i, this just means finding the position of ii in the new array). Let's say this index was f(i)f(i). Then ff is a function from {1,2,,n}\{1, 2, \dots, n\} to itself. Now notice that when we write out the array [f(1),f(2),,f(n)][f(1), f(2), \dots, f(n)], this satisfies the definition of a permutation! And we can easily construct a shuffling operation corresponding to every permutation just by reversing this process.
  2. What was the old index of the element that is now at position ii? (In the case when ai=ia_i = i, this just means looking at position ii in the new array and reporting the number there). Let's say this index is g(i)g(i). Then gg is again a function from {1,2,,n}\{1, 2, \dots, n\} to itself. When we write out the array [g(1),g(2),,g(n)][g(1), g(2), \dots, g(n)], this is again a permutation! Note that ff and gg are not defined the same way, and the generated permutations are, in fact, distinct. In fact, it should not be hard to see that f(g(x))=g(f(x))=xf(g(x)) = g(f(x)) = x for any xx in {1,2,,n}\{1, 2, \dots, n\}.

This means shuffling around elements (or permuting them, in standard terminology) is another way to think about permutations.

This shows us how permutations are related to functions from {1,2,,n}\{1, 2, \dots, n\} to itself such that all images are distinct (and all elements have a unique pre-image). In other words, a permutation can be thought of simply as a bijection on a set, which is the most natural definition to use for quite a few applications. The second interpretation is what corresponds to this definition, and the thing in the first definition is what we sometimes call the position array of the permutation in the second definition (this name is not standard). We will later see that it is what is called the inverse of the permutation.

Note that the above explanation is quite important for having an intuition on what a permutation is. If you don't understand some parts, I recommend re-reading this and trying a few simple examples until you're comfortable with what the permutation and functions ff and gg have to do with each other.

Two-line notation

Note that a permutation [a1,a2,,an][a_1, a_2, \dots, a_n] of [1,2,,n][1, 2, \dots, n] corresponds to a function ff on {1,2,,n}\{1, 2, \dots, n\} defined by f(i)=aif(i) = a_i. Implicitly speaking, the set of pairs (i,ai)(i, a_i) uniquely determines the permutation (it is just the set of mappings from position to element) and vice versa. To understand compositions and inverses (which will be introduced later on) better, the following notation can come in handy:

Here we have just arranged the pairs vertically, from left to right. Note that these pairs can be shuffled around (like dominoes), and this can lead to some nice properties:

(12na1a2an)\begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix}

  1. What happens when we invert the pairs (that is, replace (i,ai)(i, a_i) by (ai,i)(a_i, i) for all ii, which corresponds to swapping the two rows)? If the shown permutation corresponds to the function gg, the permutation after inverting the pairs corresponds to the function ff! This should be clear by looking at how ff and gg were defined earlier. So this sort of operation leads to a function inverse.
  2. What happens if we try to combine two permutations by trying to match the lower half of the first permutation p1p_1 with the upper half of the second permutation p2p_2? If the gg for p1p_1 is g1g_1, and that for p2p_2 is g2g_2, then it can be verified that the new permutation's gg is just the function formed by applying g1g_1, then g2g_2. So this sort of operation leads to function composition.

Permutation invariants

Note that when a permutation is sorted, it leads to [1,2,,n][1, 2, \dots, n]. So any accumulation operation on the permutation array (like sum, product, xor, sum of squares, number of odd integers, etc.) that doesn't rely on the order of elements in an array will give the same value for all permutations of the same length.

The ordering perspective

Fixed points

A fixed point of a permutation aa is an index ii such that ai=ia_i = i. These are essentially the values that are not affected by the permutation at all, so if we disregard what happens to these values, we won't be losing out on any information for that permutation aa. This is also why these ii-s are sometimes skipped in the two-line notation (and also in the cycle notation, which will be introduced in the next section).

Note: If you are familiar with cycles or are on your second reading, you might note that this idea is probably more appropriate for cycles and compositions, but I kept it in this section since it shares some ideas with other subsections in this section. The same goes for the next section.

Derangements

A derangement is a permutation with no fixed points. That is, for every ii, we have aiia_i \ne i. One useful thing to know is how many of the n!n! permutations of size nn are derangements. Let's say this number is DnD_n. There are at least three distinct ways to count them:

  • Solving linear equations

    In this approach, we group all possible permutations by the number of their fixed points. If there are ii fixed points, then upon relabeling the remaining nin - i elements from 11 to nin - i, the resulting permutation is a derangement. So, the number of permutations on nn elements with ii fixed points is exactly (ni)Dni\binom{n}{i} \cdot D_{n - i}.

    Summing this over all ii gives us the identity n!=i=0n(ni)Dnin! = \sum_{i = 0}^n \binom{n}{i} \cdot D_{n - i}.

    Along with the fact that D0=1D_0 = 1 and D1=0D_1 = 0, we can use induction to show that Dn=n!i=0n(1)ii!D_n = n! \cdot \sum_{i = 0}^n \frac{(-1)^i}{i!}. We can also use the following identity to come up with the same result:

    Special case of Mobius inversion

    If there are functions ff and gg from 0\mathbb{Z}_{\ge 0} to \mathbb{R}, then the following two statements are equivalent:

    1. g(n)=i=0n(1)i(ni)f(i)g(n) = \sum_{i = 0}^n (-1)^i \binom{n}{i} f(i)
    2. f(n)=i=0n(1)i(ni)g(i)f(n) = \sum_{i = 0}^n (-1)^i \binom{n}{i} g(i)
  • Recursion

    Let's count the number of derangements using a recursion. Suppose an=kna_n = k \ne n. We look at aka_k. If aka_k is nn, then there are Dn2D_{n - 2} ways of assigning the remaining numbers to the permutation. If aka_k is not nn, then we note that this reduces to the number of derangements on n1n - 1 numbers, by renaming nn (to be assigned in the remaining n2n - 2 slots) to kk.

    So the recurrence becomes Dn=(n1)(Dn1+Dn2)D_n = (n - 1) \cdot (D_{n - 1} + D_{n - 2}), and this can be solved using induction too.

  • Using the principle of inclusion and exclusion

    We can use an inclusion-exclusion argument to solve this problem too. Let SiS_i be the set of permutations that leave the ii-th element fixed (the remaining are free to be permuted).

    Then we need n!|iSi|n! - |\cup_i S_i|. Using the inclusion-exclusion principle, since the intersection of any kk different SiS_i's has size (nk)!(n - k)!, we again get the same expression for DnD_n.

Inversions

An inversion in an array (not necessarily a permutation) is any pair of indices (i,j)(i, j) such that i<ji < j and ai>aja_i > a_j. What is the minimum number of inversions? It is 0, because you can just enforce ai=ia_i = i. What is the maximum number of inversions? It is n(n1)/2n(n - 1)/2 because you can just enforce ai=ni+1a_i = n - i + 1, and then every unordered pair of distinct indices corresponds to an inversion.

Let's consider the following task: For a given permutation (or an array with distinct elements), you are allowed to do the following operation any number of times:

  • Pick any 1i<n1 \le i < n and swap aia_i and ai+1a_{i + 1}.

  • Is it possible to sort this array using these operations only?

  • If yes, what is the minimum number of operations if you want to sort this array using these operations only?

Note that if it is possible for a permutation, it is also possible for any array with distinct elements (we can just replace the elements in the array by their ranks when sorting in increasing order).

The answer to the first part is yes. The main idea is to pick the smallest element and keep swapping it with the previous element until it reaches the first position. Then do the same thing recursively for [2,,n][2, \dots, n].

Let's think about the number of operations here. How many operations do we do in the first phase? Note that everything coming before 11 in the original permutation contributed to an inversion where the second element was 11. Now after 11 is in its intended place, there will be no more inversions that are related to 11 in any way. For the rest of the permutation, we can do this argument recursively. But for that, we need to see what happens to inversions where 11 was not involved. Note that the relative order of no other pair changes, so the other types of inversions remain the same! So, the total number of operations is clearly the number of inversions in the permutation.

Is this the minimum number of operations you need to sort the array? The answer turns out to be yes. Consider any operation that is done. Since it flips the order of two adjacent things, there is at most one inversion it can reduce. So the decrease in the number of inversions is at most 11. In a sorted array, there are no inversions. So the number of operations must be at least the number of inversions in the original permutation, and thus we are done.

Now you might think --- how do we actually count the number of inversions in a permutation? The answer is you can do that using merge sort or by some point-update-range-sum data structure like Fenwick tree or segment tree. The merge sort solution can be found here, and for the data structure solution, consider the following:

  • Initially, you have an array AA of size nn filled with 00 (on which the data structure will be constructed).
  • For each ii starting from 11 to nn, find the sum of the prefix of AA of size ai1a_i - 1 (that is, sum of AjA_j for 0j<ai0 \le j < a_i), and add it to the answer. Increment AiA_i by 11.

Longest increasing subsequences and Erdos Szekeres theorem

An increasing subsequence of an array aa (not necessarily a permutation) is a sequence of indices i1<i2<<iki_1 < i_2 < \dots < i_k such that ai1<ai2<<aika_{i_1} < a_{i_2} < \dots < a_{i_k}. Decreasing subsequences are defined similarly.

An algorithm to find a longest increasing (or, with a few modifications, non-decreasing) subsequence of an array can be found in this nice video tutorial. But that is not what we are concerned with at the moment.

We are concerned with bounds on the length of any longest increasing subsequence (LIS from here on). However, for decreasing subsequences, an LIS has length 11. The Erdos Szekeres theorem tells us that in such cases, the length of the longest decreasing subsequence will be large.

More formally, the theorem states that in any permutation (or array with distinct elements) of size at least xy+1xy + 1, there is either an increasing subsequence of length x+1x + 1 or a decreasing subsequence of length y+1y + 1.

The easiest way to prove this theorem is via an application of the Pigeonhole principle.

Suppose for the sake of contradiction that the theorem is false for some permutation aa. For every ii, consider the length of a longest increasing subsequence ending at index ii and the length of a longest decreasing subsequence ending at index ii. Let's call these numbers xix_i and yiy_i. Note that all xix_i-s are integers between 11 and xx, and all yiy_i-s are integers between 11 and yy. So there can be at most xyxy distinct pairs (xi,yi)(x_i, y_i). By the Pigeonhole principle, there are i<ji < j such that (xi,yi)=(xj,yj)(x_i, y_i) = (x_j, y_j). Since all elements are distinct, ai<aja_i < a_j or ai>aja_i > a_j. In the first case, it is impossible that xi=xjx_i = x_j, and in the latter, it is impossible that yi=yjy_i = y_j. This is a contradiction, and we are done.

A more sophisticated and deeper proof of this theorem can be done using Dilworth's theorem. This blog uses it to prove a special case of this theorem, though the proof can be modified easily to work for the complete proof too.

Next permutation

Note that permutations are just sequences of integers, so it is possible to sort the set of all possible sequences of size nn lexicographically (i.e., in the order you would find words in a dictionary). This defines a natural indexing of each permutation. How do we find the next permutation from a given permutation?

The easiest way (in C++) is to use std::next_permutation, but we'll briefly describe how it works.

Let's find the first index ii from the right so that ai>ai1a_i > a_{i - 1}. Since ii was the first index satisfying this condition, all indices ii to nn must form a decreasing sequence. Note that the smallest number in this sequence that is larger than aia_i will be the new element at position ii, and the rest of them (along with the current aia_i) will be sorted in increasing order after it. All of this can be implemented in time O(ni+1)O(n - i + 1).

Note that starting from the lexicographically smallest permutation (which is [1,2,,n][1, 2, \dots, n]), the number of permutations between (both inclusive) this permutation and a permutation whose first position ii such that aiia_i \ne i is kk, is at least (nk)!+1(n - k)! + 1. This means that if you apply next_permutation even a large number of times, the number of elements in the permutation that will ever change will not be large (unless you start from a specific permutation, and even in that case, apart from a single change for potentially all indices, the same conclusion holds).

So if for a permutation of length nn, you do the next_permutation operation (as implemented above) O(nk)O(n^k) times, the time taken will be (much faster than) O(knklogn)O(k n^k \log n). You can even bound the number of operations to perform next_permutation rr times by O(r+n)O(r + n). The analysis is similar to the complexity analysis when you increment the binary representation of an nn-bit number rr times.

Random permutations

There are a total of n!n! permutations of size nn. How do we construct a random permutation if all we have is a random number generator that can give us integers in a range we can choose?

For thinking about how we can incrementally construct such a permutation, let's try to remove all elements >k> k for some kk. Note that the position of kk in this array is equally likely to be any integer from 11 to kk. However, this won't lead to a very efficient algorithm right away.

We would want to rather construct a random permutation from left to right. Suppose we have a prefix chosen already. Note that the permutations of all elements on the right are equally likely. Also, all elements on the right are equally likely to be the first element of the suffix (i.e., the last element of the just-larger prefix). This observation leads to Durstenfeld's variation of the Fisher-Yates shuffle.

Now let's think about some statistics of random permutations.

Firstly, what is the expected length of the greedily picked increasing sequence by this algorithm:

  • If there is nothing in the sequence or the new element is greater than the last picked element, pick this element.

Note that an element is picked if and only if it is more than everything before it. That is, we can do the following using linearity of expectation:

E[l]=E[i(ai chosen)]=iP[ai chosen]=iP[ai>max(a1,,ai1)]=i1iE[l] = E[\sum_i(a_i \text{ chosen})] = \sum_i P[a_i \text{ chosen}] = \sum_i P[a_i > \max(a_1, \dots, a_{i - 1})] = \sum_i \frac{1}{i}, which is the harmonic sum, and is approximately lnn\ln n.

However, it can be shown that the expected length of the LIS is Θ(n)\Theta(\sqrt{n}), so a greedy approach is much worse than computing the LIS systematically.

For more random permutation statistics that also have some information on statistics derived from the next few sections, I would refer you to this.

The cycle decomposition perspective

We will now switch gears and note a very important part of the theory of permutations --- the cycle decomposition. You will find yourself referring to this quite a lot while solving problems on permutations.

Cycles

Suppose we have a permutation aa. Let's fix some ii. Let's try to look at the sequence ii, aia_i, aaia_{a_i} and so on. Eventually, it must repeat because there are at most nn values that this sequence can take. For the sake of convenience, let's call the jj-th element of this sequence bjb_j.

Then for some k<lk < l, we have bk=blb_k = b_l. Let's take kk to be the smallest such index, and let ll be the smallest such index for that corresponding kk. If kk is not 11, then we must have bk1=bl1b_{k - 1} = b_{l - 1}, because aa is a bijection. However, this contradicts the minimality of kk! This means that the first element that repeats in the sequence is, in fact, ii.

Now suppose something in the sequence between 11 and ll repeats (let's say at positions 1<m<o<l1 < m < o < l). Then by repeating the bijection argument m1m - 1 times, we must have bom+1=b1=ib_{o - m + 1} = b_1 = i, so om+1lo - m + 1 \ge l. But this is a contradiction. This means that the sequence repeats starting from ll and forms a cycle.

We call this a cycle because if we made the graph where there was a directed edge from ii to aia_i, then this ii would be in a cycle of length l1l - 1.

Now, this was done for a single ii. Doing this for all ii means that all elements are in a cycle in this graph. Clearly, since the indegree and outdegree of each ii are 11, this means that these cycles are all disjoint.

Let's take an example at this point. Consider the permutation [2,3,1,5,4][2, 3, 1, 5, 4]. The cycles for each element are as follows:

  • 12311 \to 2 \to 3 \to 1
  • 23122 \to 3 \to 1 \to 2
  • 31233 \to 1 \to 2 \to 3
  • 4544 \to 5 \to 4
  • 5455 \to 4 \to 5

These correspond to the cycles 12311 \to 2 \to 3 \to 1 and 4544 \to 5 \to 4 in the graph.

Cycle notation

Note that these cycles are independent. That is, we can just say that for these cycles, any element outside the cycle is irrelevant. So in this sense, for any permutation, we can just list out its cycles and we will be able to determine the permutation uniquely.

So we just represent a permutation as a list of its cycles. For example, the permutation [2,3,1,5,4][2, 3, 1, 5, 4] can be represented as (123)(45)(1 2 3)(4 5).

Note: there is a deeper meaning behind this notation that is related to composition and how disjoint cycles commute for composition.

Finding kk-th next, functional graphs

Consider the following task: for a permutation, compute the value of aaia_{a_{\ddots_{i}}} for each ii where there are kk aa-s, k1018k \le 10^{18}.

Note that if you can find cycles, you can do a cyclic shift for each cycle (appropriately computed modulo the cycle length) and give the answer.

What if it was not guaranteed that aa is a permutation, but all elements in aa are in [1,n][1, n] nevertheless? The cycle argument breaks down here, but you can show the following fact:

  • Each weakly connected component consists of two parts: a directed cycle and directed trees rooted at distinct vertices in the cycle, directed towards the root (also called reverse-oriented arborescences).

The same problem can be solved in this setting, too, by using binary lifting for each vertex till it reaches the "main" cycle of its component.

In the language of graph theory, such graphs are called functional graphs (which have outdegree of each vertex = 1).

Some problems to practice on are Planet Queries (I and II) on CSES.

Swaps/transpositions

Let's start from the array [1,2,,n][1, 2, \dots, n] and try to apply swaps on this array till we reach some desired permutation a=[a1,a2,,an]a = [a_1, a_2, \dots, a_n]. Note that we can always apply such swaps, by trying to build the permutation from left to right. These swaps are also called transpositions.

Swaps on cycles

Let's now think about what happens when we apply a swap to a permutation. Suppose the swap swaps elements at positions ii and jj. Let's look at the graph associated with this permutation. In this graph, swapping two elements at positions xx and zz is equivalent to breaking two edges xyx \to y and zwz \to w and making two edges xwx \to w and zyz \to y (in something like a crossing manner).

We make two cases:

  1. ii and jj are in different cycles: this joins the two cycles together.
  2. ii and jj are in the same cycle: this breaks the common cycle into two.

A relevant problem at this point would be 1768D.

Let's consider another permutation sorting problem, but here rather than just adjacent elements being swapped, we allow swapping any elements. What is the minimum number of swaps to sort a permutation this way?

Note that in the final result, there are exactly nn cycles (singletons). Let's say we currently have cc cycles. Since the number of cycles increases by at most 1 each time, there must be at least ncn - c swaps to sort this permutation.

Is this achievable? Yes. We can keep swapping elements in the same cycle while there is a cycle of length at least 22, and this will sort the permutation.

The permutation composition perspective

Definition

Suppose we have two permutations aa and bb of the same size nn. Now we want to define a composition of them as follows: abab is defined as the array cc where ci=abic_i = a_{b_{i}}.

This essentially corresponds to finding the composition of the gg-functions of aa and bb, with gg for bb being applied first.

Let's get an intuitive sense of what it does, by getting some information out of the definition. Let's first deal with the case when bb is the "identity" permutation: bi=ib_i = i. Then c=ac = a according to this definition. Similarly, if aa is the identity permutation, then c=bc = b. So, composing with the identity permutation in any order gives the original permutation back.

Now let's understand this by using the two-line notation. The idea is to take the two-line notation for both permutations and do some sort of pattern matching to get to their composition.

Consider the following permutations aa and bb.

(12na1a2an)\begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix}

and

(12nb1b2bn)\begin{pmatrix} 1 & 2 & \cdots & n \\ b_1 & b_2 & \cdots & b_n \end{pmatrix}

Let's reorder the columns of the two-line notation for aa so that the lower row of bb and the upper row of aa match. However, this corresponds to shuffling the columns of aa according to bb:

(b1b2bnab1ab2abn)\begin{pmatrix} b_1 & b_2 & \cdots & b_n \\ a_{b_1} & a_{b_2} & \cdots & a_{b_n} \end{pmatrix}

Note that this is still a valid two-line notation for aa. Now, if we "merge" this with the two-line notation for bb, this gives us the two-line notation for the composition of aa and bb.

Note how this idea of composition arises naturally from the interpretation of a permutation as a bijection on a set.

We start with the identity permutation, apply the first permutation to it, and then apply the second permutation to it. Applying means replacing xaxx \mapsto a_x. The two-line notation for a permutation aa has the following property:

  • If the permutation in the first row is pp, then the permutation in the second row is apap.

Examples --- cycles and swaps

Note that we can associate a cycle itself with a permutation, where the non-cycle elements are fixed points, and the remaining are mapped according to the cycle. In this sense, cycles are permutations.

Also note that a swap itself is a permutation in the same manner. In fact, a swap is just a cycle of length 2.

It is helpful to play around with a few examples of swaps and cycles and write them in two-line notation to understand their mappings and try composing them.

At this point, you should also try to recall the following two things:

  1. When we introduced cycles, we wrote them in a certain format. In fact, you can compose disjoint cycles without worrying about the order in which they are composed. This should be obvious from the fact that they are independent in a way (an element that gets moved in one permutation is a fixed point of the other). You can also see that the notation (123)(45)(1 2 3)(4 5) also conveniently captures that this permutation is a composition of the cycles (123)(1 2 3) and (45)(4 5).

  2. When we discussed swaps, we noted that you could "apply" swaps. This corresponds to left-multiplying the permutation with the corresponding swap permutation. It is instructive to come up with a few examples of swaps and how they compose at this point.

It is important to note that (by using the fact that function composition is associative) permutation composition is associative.

Let's say we have a permutation where we came to by applying the following three permutations: (12)(12), (312)(43)(312)(43) and (53241)(53241) in this order, we would write it as (53241)(312)(43)(12)(53241)(312)(43)(12). Note that here (12)(12) and (43)(43) are cycles of length 22, i.e., they are swaps.

Now how do we reduce this back to a normal permutation? One way would be to write out the permutation for each cycle and then keep composing it. A cleaner way of doing the same thing would be to do this: for every element, start from the right, and replace it with what the cycle maps it to. For instance, if we want to find the image of 11, the first cycle maps 11 to 22, the second cycle maps 22 to 22, the third cycle maps 22 to 33, and the fourth cycle maps 33 to 22.

Parity of permutations

Recall that we mentioned earlier that an adjacent swap reduces the number of inversions by at most 11. Well, to be more precise, it changes the number of inversions by ±1\pm 1, and hence flips the parity of the number of inversions of the permutation.

We define the parity of the permutation to be the parity of the number of inversions of the permutation.

The reason why this is important is that it gives a handle on the parity of swaps (not just adjacent swaps) as well as some information about the cycle parities, as follows.

Let's consider a swap that swaps elements at positions i<ji < j. Then a way to get to this would be to apply jij - i adjacent swaps to bring aja_j to position ii, and ji1j - i - 1 adjacent swaps to bring aia_i (from its new position) to position jj. This takes an odd number of total adjacent swaps, so the final number of inversions is also flipped.

As a corollary, applying any swap changes the parity of the permutation, and in particular, all swap permutations have odd parity (parities add up modulo 22 over composition, as can be seen easily).

Now note that for a cycle, we can construct it by doing a few swaps. More specifically, if the cycle is (i1,i2,,ik)(i_1, i_2, \dots, i_k), then we can do k1k - 1 swaps to apply the same transformation as the cycle permutation.

So all even-sized cycles have odd parity, and all odd-sized cycles have even parity.

Let's consider any decomposition of a permutation into (possibly non-disjoint) cycles. As a result of the above discussion, the parity of the permutation is odd iff the number of even-sized cycles is odd.

In particular, for a decomposition of a permutation into swaps, the parity of the permutation is the same as the parity of the number of swaps in the decomposition. Note that this also implies that we can't have two decompositions with an odd difference in the number of swaps.

Rephrasing the above result, we have the fact that if it takes an odd number of swaps to get to a permutation, the permutation has odd parity and vice versa.

Parity under composition

Clearly, from the above discussion, it can be seen that parities add modulo 22 over composition. This, however, is important enough to warrant its own section because it summarizes all of the discussion above.

Inverse of permutations

Firstly, let's think about permutation in an application sense. Is there a way to get back the original permutation after some permutation has been applied to it? From our remarks on the two-line notation, it suffices to do this for the initial permutation being the identity permutation. Let's say the applied permutation was aa:

(12na1a2an)\begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix}

By our remarks on composition using the two-line notation, we need something that swaps these two rows as follows:

(a1a2an12n)\begin{pmatrix} a_1 & a_2 & \cdots & a_n\\ 1 & 2 & \cdots & n \end{pmatrix}

Note that this is easy: consider the position array pp of aa as we defined earlier; i.e., pip_i is the position where aa equals ii, i.e., api=ia_{p_i} = i. Note that aa is aia_i at ii, so paip_{a_i} is the position where aa is aia_i, i.e., pai=ip_{a_i} = i. This shows that pa=appa = ap, and this common value is the identity permutation (which we shall denote by id\text{id} in what follows).

In other words, pp and aa are inverses of each other under composition. It is trivial to note that aa is the position array of pp as well. We denote pp by a1a^{-1}.

Now that we have some intuition about the inverse in terms of the two-line notation, we think about it in terms of cycles, swaps, and compositions:

  1. The inverse of a cycle (i1,i2,,ik)(i_1, i_2, \dots, i_k) is simply (ik,,i2,i1)(i_k, \dots, i_2, i_1). This is done by reversing all the edges in the related directed graph.
  2. The inverse of a swap is itself. This also follows from the fact that a swap is a 2-cycle.
  3. The inverse of a composition abab is b1a1b^{-1}a^{-1}. This corresponds to undoing the permutation application on a stack and also makes algebraic sense: abb1a1=aa1=id=b1b=b1a1ababb^{-1}a^{-1} = aa^{-1} = \text{id} = b^{-1}b = b^{-1}a^{-1}ab.

In particular, for a decomposition of a permutation into (not necessarily disjoint) cycles, we can just take a mirror image of the decomposition, and it will correspond to the inverse of the permutation!

Also, the parity of the inverse is the same as the parity of the original permutation.

Involutions

An involution is a permutation whose inverse is itself. Consider the disjoint cycle decomposition of the permutation a=c1cka = c_1 \dots c_k. a=a1a = a^{-1} means id=a2=ici2\text{id} = a^2 = \prod_i c_i^2. Note that we are able to write this because disjoint cycles commute, and so do identical cycles. ci2c_i^2 should be an identity map on the elements of cic_i because else the resulting permutation won't be id\text{id}. This means that cic_i is either of size 11 or 22. Conversely, it can be checked that these permutations are involutions.

kk-th power of permutations

Since permutation composition is associative, we can use binary exponentiation to compute the permutation. A more mathematical way to think about it is to use the same trick as in the previous subsection. With definitions the same as in the previous section, for any aa, ak=icika^k = \prod_i c_i^k, so the cycle structure is determined by the powers of the cycles.

Finding the kk-th power of a cycle just corresponds to mapping it to its kk-th next neighbor in the cycle. If the length of the cycle is cc, and g=gcd(c,k)g = \gcd(c, k), then the disjoint cycle decomposition of ckc^k consists of gg cycles, with each of them corresponding to a stride of kk along the original cycle.

Order of permutations

The order of a permutation aa is defined as the least kk such that aka^k is the identity permutation. Let's again look at the cycle decomposition as in the previous subsection. For a cycle cc of length ll to be "dissolved" into singletons, it is necessary and sufficient to have the power of the permutation divisible by ll. Hence, the order of the permutation is just the LCM of all cycle lengths.

Square-root of permutations

For this section, we will consider the following problem: 612E. The solution is left as an exercise to the reader (it shouldn't be hard, considering what we have discussed in the last few sections).

Conjugation and conjugacy classes

Suppose a,ba, b are two permutations. We think about the following permutation: aba1aba^{-1}. If we think about how the two-line notation for these permutations combines, we can note that this is just the permutation whose cycle decomposition is almost the same as bb, but aa is applied to each element in the cycles. In other words, we "reinterpret" the cycles by mapping to and from a separate "ground set" via permutation aa.

For a formal statement, let the cycle decomposition of bb be c1ckc_1 \cdots c_k, where ci=(xi1,,xili)c_i = (x_{i1}, \dots, x_{il_i}). We claim that aba1aba^{-1} is c1ckc'_1 \cdots c'_k where ci=(axi1,,axili)c'_i = (a_{x_{i1}}, \dots, a_{x_{il_i}}).

The proof formalizes the earlier argument. Consider an element ii, and we look at what happens to ii as each permutation is applied to it:

  1. aba1aba^{-1}: There is a jj such that aj=ia_j = i since aa is a permutation. So (aba1)i=abai1=abj(aba^{-1})_i = a_{b_{a^{-1}_i}} = a_{b_j}.
  2. c1ckc'_1 \cdots c'_k: Without loss of generality, let's say ii is the first element of c1c'_1, i.e., i=axi1i = a_{x_{i1}}. Then j=xi1j = x_{i1}. Now it is mapped to axi2=abxi1=abja_{x_{i2}} = a_{b_{x_{i1}}} = a_{b_j}.

Since both of them give the same result, we are done.

This says that the operation aba1aba^{-1} is nothing but a translation of the labels of bb according to aa.

Note that by this result, since we are just applying the permutation to everything inside the cycles, we can map any product of disjoint cycles to any other product of disjoint cycles, provided the multiset of cycle sizes doesn't change.

Let's call bb and cc to be conjugate if there is an aa such that c=aba1c = aba^{-1}, and call this operation conjugation.

Thus this leads to a partition of all permutations into equivalence classes (after verifying a couple of axioms), where everything inside a class is conjugate to everything else in the class. Note that any equivalence class is uniquely determined by the multiset of cycle sizes. These equivalence classes are called conjugacy classes.

Miscellaneous topics

This section will mostly be about some very brief references and is intended to tell people about some topics they might not have seen or heard about before, which are relevant to this discussion. There might be posts in the future regarding these topics, but it's still a good idea to read up on these yourself.

Group theory

The set of permutations of size nn forms what is called a group and is denoted by SnS_n, and it turns out that all finite groups of size nn are isomorphic to some subgroup of SnS_n. This is why there has been a lot of work on permutations from a group theoretic perspective. The section on permutation composition is best viewed under a group theoretic lens.

The analysis of a lot of games can be done using group theory, for instance, 15-game, Rubik's cube, Latin squares, and so on.

There is a subgroup of permutations that corresponds to all even permutations. This group is also known to be simple and can be used to show results about the 15-game, for example.

Counting: Burnside, Polya enumeration, Stirling numbers

Using group theoretic ideas, the Burnside lemma and the Polya enumeration theorem come into the picture, and they help to count classes of objects (equivalent under some relation) rather than objects themselves.

Stirling numbers are quite useful when counting things related to permutations, and they deserve a whole post for themselves.

Coordinate compression

Note that in the above discussions, especially in the section on ordering, we used arrays of distinct elements and permutations interchangeably. Whenever something only depends on the ,<,>,,=\le, <, >, \ge, = relations between the elements and not the elements themselves, it makes sense to replace elements with their rank. In the case of distinct elements, this "compressed version" becomes a permutation.

Connected component DP

Just like the failed idea in the generation of permutations, that idea can also be used for dp. More details can be found in this post.

Young tableaus

Closely related to the LIS and other ordering concepts is the idea of Young tableaus. This has a very interesting theory, and this post is a good reference for some of the results.

Schreier Sims algorithm

Suppose you have a subgroup generated by some permutations, and you want to find the size of the subgroup. The Schreier Sims algorithm helps in finding this size, and a good post can be found here.

Permutation tree

There is a cool and interesting data structure that can do different kinds of queries on permutations and subarrays. This post is a good introduction to the data structure and what it can do.

Problems

I have added some standard problems whenever suitable, but for practicing permutations well, it makes sense to practice on a larger set of problems. Since there is a very large number of problems on permutations, maybe just adding a few of them that I know would not do justice to the variety of problems that can be made on permutations. A couple of possible methods of practicing are the following:

  1. Search on the CF problemset for problems with the name permutation in them.
  2. Look out for permutation problems on sources for standard problems like CSES.
Cite this post
@online{permutations-for-beginners-2023,
  author    = {nor},
  title     = {A comprehensive guide to permutations for beginners},
  year      = {2023},
  month     = {01},
  day       = {09},
  url       = {https://nor-blog.pages.dev/posts/2023-01-09-permutations-for-beginners},
}