Java Basic Data Structures

Array

An array in Java holds a fixed number of elements of a single type and is declared by defining the variable type with square brackets.

1String[] clothes;
2int[] numbers;

Arrays can be created with the new operator, or with array literals syntax that combines the creation of the array object with the initialization of its elements. In Java, array entries are initialized to their default values upon array creation if not explicitly set. Any primitive number types, such as int, long, byte, have default values of 0. char has a default value of \u0000, which is the null character. boolean has a default value of false, and all other reference types that hold objects, such as String, have a default value of null.

An element in an array is accessed by its index in O(1) time. The length of an array, array.length, is set and fixed at the time of creation.

1// create an array of numbers 1 to 5
2int[] numbers = {1, 2, 3, 4, 5};
3System.out.println(numbers.length);  // print out 5
4
5// create an array capable of holding 5 strings
6String[] clothes = new String[5];  // since String is of type object, clothes now hold 5 null values
7clothes[0] = "shirt";  // initialize first element
8clothes[1] = "dress";  // initialize second element

ArrayList

When you need to create a dynamically resizable array, Java provides a useful data structure ArrayList. Like arrays, ArrayList provides constant time O(1) performance for get(), set(), and size() methods. The add() operation, which appends a new element to the end of the ArrayList, runs in amortized constant time. However, using add() with specific indices runs in O(n) time on average. Another function that runs in linear O(n) time on average is remove(). When removing, we have to iterate the entire array to find the element qualifying for removal. ArrayList will resize itself when needed. A typical resizing implementation is that when the array is full, the array doubles in size, and the old content is copied over to the newer, larger array. The doubling takes O(n) time, but it happens rarely that the overall insertion still takes O(1) time.

In general, to iterate through a list, a for loop is easier to reason with than while since there's no condition to manage that could skip elements. In Java, there are two types of for loops: the simple for loop (with variable initialization, condition check, and variable increment/decrement like for-loops in C/C++) and the for-each loop.

1int[] numbers = {20, 6, 13, 5};
2// simple for loop goes through indices so we fetch elements using indices
3for (int i = 0; i < numbers.length; i++) {
4   int number = numbers[i];
5   System.out.println(number);
6}
7
8// for-each loop directly fetches elements
9for (int number : numbers) {
10   System.out.println(number);
11}

Linked List

Java doesn't have a built-in singly linked list. Normally at an interview, you'll be given a definition like this:

1class LinkedListNode {
2   public int val;
3   public LinkedListNode next;
4
5   public LinkedListNode(int val) {
6       this.val = val;
7   }
8}
  • append to end is O(1)
  • finding an element is O(N)

Besides problems that specifically ask for linked lists, you don't normally define and use linked list. If you need a list with O(1) append you can use ArrayList, and if you want O(1) for both prepend and append you can use ArrayDeque.

Stack

Official Java doc recommends ArrayDeque over the Stack class.

Deque<Integer> stack = new ArrayDeque<Integer>();

With a deque, you can push and pop on both ends in constant time.

Since we just need to push and pop in constant time on the tail, we also can use ArrayList to implement stack data structure with minimal memory consumption and ideal cache locality.

ArrayList<Integer> stack = new ArrayList<>();

  • push: add(item) adds item to end of stack, O(1)
  • pop: remove(stack.size() - 1) removes and returns item at the end of stack, O(1)
  • size: size(), O(1)
  • top: get(stack.size() - 1) returns but don't remove item at the end of stack, O(1)

The difference between the two is rather small and only in resizing.

Stack is arguably the most magical data structure in computer science. In fact, with two stacks and a finite-state machine you can build a Turing Machine that can solve any problem a computer can solve.

Recursion and function calls are implemented with stack behind the scene. We'll discuss this in the recursion review module.

Queue

โ†
โ†‘TA ๐Ÿ‘จโ€๐Ÿซ