IUSACO
Input and Output
#include <cstdio>
using namespace std;
int main() {
freopen("template.in", "r", stdin);
freopen("template.out", "w", stdout);
}
- When using C++, arrays should be declared globally, or initialized to zeros if declared locally to avoid garbage values.
- 32bit int: \(\pm 2\times10^{9}\) v/s 64bit int: \(\pm 9\times 10^{18}\)
Complexity and algorithm analysis
- Elementary mathematical calculation: O(1)
- Unordered set/map: O(1) per operation
- Binary Search: O(log n)
- Ordered set/map or Priority Queue: O(log n) per operation
- Prime factorization or primality check for int: \(O(\sqrt{n})\)
- Reading n inputs: O(n)
- Iterating through n element array: O(n)
- Sorting: Usually O(n log n) for
std::sort()
- Iterating through all subsets of size k of input elements: O(\(n^{k}\) ), for triplets O(\(n^{3}\))
- Iterating through all subsets: O(\(2^{n}\))
- Iterating through all permutations: O(n!)
Built-in Data Structures
Data Structure determines how data is stored, each supports some operations efficiently. In following discussion, desired data type is put between <>
. Most std structures support size()
and empty()
methods.
Iterators
Allows for traversal of a container with the help of a pointer.
for (vector<int>::iterator it = myvector.begin(); it != myvector.end(); ++it) {
cout << *it; //prints the values in the vector using the pointer
}
Alternate way to achieve the same with a for-each loop and auto
.
for(auto element : v){
cout << element; // prints values in vector
}
Dynamic Arrays
Addition and deletion at the end in O(1) time and in the middle in O(n) time.
vector<int> v;
for(int i = 1; i <= 10; i++){
v.push_back(i); // stores 1 to 10 in a dynamic array
}
Vectors can be made static sized by initializing it with a size, vector<int> v(30);
. They also support an v.erase()
operation. A dynamic array can be sorted (default ascending) by sort(v.begin(), v.end())
.
Stacks and Queues
Stacks: LIFO with operations push
(add at end), pop
(remove at end) and top
(show end) all of which are O(1). Declared as stack<int> s
.
Queues: FIFO with operations push
(add in front), pop
(remove at end) and front
(show end) in O(1).
Deques: Combination of a stack and a queue supporting insertion and deletion from both front and end. Operations are aptly named as push_back
, push_font
, pop_back
and pop_front
.
Priority Queues: Supports insertion of elements and deletion and retrieval of element with highest priority in O(log n) where priority is based on a comparator function (highest element in front). Has push
(add at end), pop
(remove at end) and top
(show end) operations and is declared as priority_queue<int> pq;
.
Sets
A set is a collection of objects having no duplicates.
Unordered Sets: Work by hashing that is, assigning a unique code to every object allowing for insert
, erase
and count
(set contains element then 1 else 0) in O(1). Traversal is pointless. Declared as unordered_set<int> s
.
for(int element : s){
cout << element << " "; // iterating through a set, arbitrary order
}
Ordered Sets: Insertion, deletion and search needs O(log n) time. Has additional operations begin()
(iterator to lowest element), end()
, lower_bound()
(iterator to least element ≥ some k) and upper_bound()
.
Multisets: A sorted set allowing multiple copies of same element, whose count
operation returns the number of times an element is present in set. Time complexity of this operation is O(log n + f) where log n factor searches for element and f factor iterates through sorted set to get count. Declared as multiset<int> ms
.
Maps
A map is a set of ordered pairs called key and value where keys must be unique but values can be repeated. Supported operations are addition and removal of key-value pair and retrieval of values for a given key. Unordered maps perform aforementioned methods in O(1) whereas for ordered maps it’s O(log n), sorted in order of key.
Unordered Maps: In map m
, m[key] = value
operator assigns value to a key and places the pair on the map, m[key]
returns value associated with the key, count(key)
checks for existence of key in the map and erase(it)
removes pair associated with a key or iterator. Declared as unordered_map<int, int> m
.
Ordered Maps: Supports additional operations lower_bound
and upper_bound
which return iterators pointing to lowest entry not less than/ strictly greater than a specified key.
map<int, int> m; // [(3,5); (11,4)]
m[10] = 491; // [(3,5); (10,491); (11,4)]
cout << m.lower_bound(10)->first << " " << m.lower_bound(10)->second << "\n";
// 10 491
cout << m.upper_bound(10)->first << " " << m.upper_bound(10)->second << "\n";
// 11 4
m.erase(11); // [(3,5); (10,491)]
Elementary Techniques
Simulation
Simulation refers to the act of doing precisely what the problem statement states and nothing else; essentially simulating it.
Complete Search
Brute forcing through all the possible cases in solution space to arrive at the solution. To iterate through all permutations of a list:
do {
check(v); // process or check the current permutation for validity
} while(next_permutation(v.begin(), v.end()));
Sorting and Comparators
C++ has built-in function for sorting in ascending order: std::sort(arr, arr+N)
or for a vector sort(v.begin(), v.end())
. For sorting in a self-defined order, one must use a custom comparator.
Greedy Algorithms
Algorithms that select the most optimal choice at each step, instead of looking at the solution space as a whole. Usually in a greedy algorithm, there is a heuristic or value function that determines which choice is considered most optimal. The choice of the greedy algorithm matters too, for example in a scheduling problem choosing earliest starting next event would be incorrect, instead one should go for earliest ending next event because that would give one more choices for future events.
Greedy won’t work in all scenarios though, for example in the fairly popular coin change problem, if the denominations are {1,3,4}
then greedy solution would be {4,1,1}
but the correct least amount of coins would be two {3,3}
. Similarly it cannot work for the knapsack problem which is solved using Dynamic Programming.
Graph Theory
Representation
Graphs (N vertices and M edges) are usually given in the format: N
M
followed by the M edges
each showing the connecting vertices. One thing to note is that a graph should be stored globally and statically, for access outside the main method. A graph can be represented in three ways:
Adjacency List
For using DFS, BFS, Dijkstra or other single-source
traversal algorithms, it’s the preferred way of storing graphs. In it, an array of length N of lists is maintained.
They take up O(M+N)
space but allow for easy traversal among the neighbors of a vertex. Often, there’s a need to maintain a visited
array.
int n, m;
vector<int> adj[MAXN];
bool visited[MAXN];
int main(){
cin >> n >> m;
for(int i = 0; i < m; i++){
int a, b;
cin >> a >> b;
a--; b--; // subtract 1 for vertex since array is zero-indexed
adj[a].push_back(b);
adj[b].push_back(a); // omit for directed graph
}
}
// For a weighted graph:
struct Edge
{
int to, weight;
Edge(int dest, int w):
to(dest), weight(w)
{
}
}
Adjacency Matrix
This is an N x N
2D array that stores for each pair of indices(a,b) whether an edge exists between them or not. Primarily used for Floyd-Warshall Algorithm.
int n, m;
int adj[MAXN][MAXN];
int main(){
cin >> n >> m;
for(int i = 0; i < m; i++){
int a, b;
cin >> a >> b;
a--; b--;
adj[a][b] = 1; // or w for weighted graph
adj[b][a] = 1; // ignore this if directed
}
}
Edge List
Usually used for weighted undirected graphs when sorting the edges by weight is needed (DSU). Its simply a single list of all edges (a, b, w)
where a and b are the vertices and w is the weight of connecting edge. Each edge is added only oncce.
struct Edge{
int a, b, w;
Edge(int start, int end, int weight):
a(start), b(end), w(weight)
{
}
bool operator<(const Edge & e)
const{
return w < e.w; // ascending weight sort
}
};
int n, m;
vector<Edge> edges;
int main(){
cin >> n >> m;
for(int i = 0; i < m; i++){
int a, b, w;
cin >> a >> b >> w;
a--; b--;
edges.push_back(Edge(a, b, w)); // add edge to list
}
sort(edges.begin(), edges.end());
}
Traversal
Breadth-First Search (BFS)
Visits nodes in order of distance away from the starting node; first visit nodes that are one edge away then those that are two edges away and so on. It can be used for finding the distance from a starting node to all nodes in an unweighted graph.
void bfs(int start){
const int total_nodes = n;
memset(dist, -1, sizeof dist); // fill distance array with -1s
queue<int> q;
dist[start] = 0;
q.push(start);
int seen = 1;
while(!q.empty()){
int v = q.front();
q.pop();
for(int e: adj[v]){
if(dist[e] == -1){
dist[e] = dist[v] + 1;
if(++seen == total_nodes) break; // see: https://observablehq.com/@yurivish/efficient-graph-search
q.push(e);
}
}
}
}
Once BFS finishes, the array dist
contains the distances from the start node to each node.
Depth-First Search (DFS)
Continues down a single path as far as possible until it has no more vertices to visit along that path, then backtracks and finds more vertices to visit.
void dfs(int node){
visited[node] = true;
for(int next : adj[node]){
if(!visited[next]){
dfs(next);
}
}
}
If stack overflows are encountered with recursive DFS, it can be implemented iteravely by storing nodes in the BFS implementation on a stack instead of a queue.
Floodfill
Its DFS but on a grid and the aim is to find the connected component of all the connected cells with the same number. As opposed to an explicit graph where the edges are given, a grid is an implicit graph where the neighbours are nodes adjacent in the four directions.
When doing floodfill, an N x M
array of bools visited
is maintained and a global variable for the size of currently visiting component. The search function is called recursively from squares on all four sides of the current one.
int grid[MAXN][MAXM];
int n, m;
bool visited[MAXN][MAXM];
int currentCompSize = 0;
void floodfill(int r, int c, int color){
if(r < 0 || r >= n || c < 0 || c >= m) return; // outside grid
if(grid[r][c] != color) return; // wrong color
if(visited[r][c]) return; // already visited
visited[r][c] = true; // mark current sq as visited
currentCompSize++;
// recursively call floodfill for neighbour sqs
floodfill(r, c+1, color);
floodfill(r, c-1, color);
floodfill(r-1, c, color);
floodfill(r+1, c, color);
}
int main(){
/*
* additional stuff here
*/
for(int i = 0; i < n; i++){
for(int j = 0; j < m; j++){
if(!visited[i][j]){
currentCompSize = 0;
floodfill(i, j, grid[i][j]);
}
}
}
}
Disjoint-Set Union Data Structure
It supports two operations:
- Add an edge between two nodes
- Check if two nodes are connected
For this, the sets are stored as trees; initially each node is its own set then the sets are combined when an edge is added between two nodes.
int parent[MAXN]; // store root of each set
void initialize(int N){
for(int i = 0; i < N; i++)
parent[i] = i; // initially, root of each set is node itself
}
int find(int x){ // find root of set of x
if(x == parent[x])
return x; // if x is its parent, it is the root
else
return find(parent[x]);
}
void union(int a, int b){ // merge sets of a and b
int c = find(a); // find a's root
int d = find(b); // find b's root
if (c != d)
parent[d] = c; // merge sets by setting parent of d to c
}
The naive recursive implementation of find can be improved from O(nm)
by path compression; the idea being reassignment of nodes on recursive calls to find
to prevent formation of long chains and the runtime becomes O(n log n)
.
int find(int x){
if(x == parent[x])
return x;
else
return parent[x] = find(parent[x]);
}
Miscellaneous
- DAGs (Directed Acyclic graphs) by virtue of not having any cycles allows them to have an ordering of nodes such that for any edge from u to v, u appears before v (topological sorting).
- Bipartite graph is such that each node can only be colored by 2 colors such that no adjacent nodes share the same colour. A graph is bipartite iff there are no cycles of odd length. A modified BFS can be use to check whether a graph is bipartite or not.
More Techniques
Prefix Sums
To process queries to find the sum of elements between two indices in a list, prefix sum is useful. Using 1-index in the array is beneficial i.e. assigning arr[0] = 0
and hence prefix[0] = 0
.
\(prefix[k] = \sum_{i=1}^{k}arr[i] = prefix[k-1] + arr[k]\)
For processing Q queries consisting on an array of N elements, the complexity is O(N+Q)
.
TODO: DP, 2D prefix sum as 1-D only
Binary Search
Basic Number Theory
Prime factorization of a number is computed by this algorithm in \(O(\sqrt{n})\):
i | n | v |
---|---|---|
2 | 252 | {} |
2 | 126 | {2} |
2 | 63 | {2,2} |
3 | 21 | {2,2,3} |
3 | 7 | {2,2,3,3} |
7 | 1 | {2,2,3,3,7} |
GCD using Euclidean Algorithm in O(log min(a,b))
:
int gcd(int a, int b){
if(!b) return a;
return gcd(b, a%b);
}
LCM can be computed using GCD by \(\frac{a \times b}{gcd(a,b)}\)
Modular Arithmetic is useful for dealing with overflows by taking remainders:
\[\begin{align*} (a \pm b)\mod m &= (a\mod m \pm b\mod m)\mod m \\ (a \times b)\mod m &= ((a\mod m) \times (b\mod m))\mod m \\ a^{b}\mod m &= (a\mod m)^{b}\mod m \end{align*}\]
Additional Topics
- Two Pointers
Iterate across an array that track the start and end of an interval or values in a sorted array. Both pointers are monotonic i.e. start at one end of array and move in only one direction.