Notes

Nuclear Norm via SDP

[ml]


Matrix norms

Given a matrix $X \in \mathbb{R}^{m \times n}$, $\sigma_{i}(X)$ denotes the $i$-th largest singular value of $X$ and is equal to the square root of the $i$-th largest eigenvalue of $XX'$. The rank of $X$, denoted $\mathrm{rank}(X) = r$, is the number of non-zero singular values.
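These two facts (singular values as square roots of eigenvalues of $XX'$, and rank as the count of non-zero singular values) can be checked numerically. A minimal sketch with numpy, on an assumed toy rank-2 matrix:

```python
import numpy as np

# An assumed toy example: a 3x3 matrix built as a sum of two rank-1 outer
# products, so rank(X) = 2.
X = np.outer([1.0, 2.0, 3.0], [1.0, 0.0, 1.0]) + np.outer([0.0, 1.0, 0.0], [0.0, 1.0, 0.0])

# Singular values of X equal the square roots of the eigenvalues of X X'.
sigma = np.linalg.svd(X, compute_uv=False)        # decreasing order
eig = np.sort(np.linalg.eigvalsh(X @ X.T))[::-1]  # eigenvalues, decreasing order
assert np.allclose(sigma, np.sqrt(np.clip(eig, 0, None)))

# rank(X) = number of non-zero singular values (up to numerical tolerance).
rank = int(np.sum(sigma > 1e-10))
assert rank == np.linalg.matrix_rank(X)
```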

Inner Product

Given $X, Y \in \mathbb{R}^{m \times n}$, the inner product between $X$ and $Y$, denoted by $\langle X, Y\rangle$, is defined as $$ \langle X, Y \rangle := \mathrm{Tr}(X'Y) = \sum_{i=1}^m \sum_{j=1}^n X_{ij}Y_{ij} = \mathrm{Tr}(Y'X). $$
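The three expressions in the definition agree, which is quick to verify numerically. A small sketch on assumed random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
Y = rng.standard_normal((3, 4))

# Three equivalent ways to compute <X, Y>.
ip_trace  = np.trace(X.T @ Y)   # Tr(X'Y)
ip_sum    = np.sum(X * Y)       # elementwise product, then sum over all entries
ip_trace2 = np.trace(Y.T @ X)   # Tr(Y'X)

assert np.isclose(ip_trace, ip_sum) and np.isclose(ip_trace, ip_trace2)
```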

LP Duality

[optimization]

Estimating LP bounds

Given an optimization problem $$ \begin{aligned} \max_{f, s} &\quad 12f + 9s \\ \text{s.t.} &\quad 4f + 2s \leq 4800 \\ &\quad f + s \leq 1750 \\ &\quad 0 \leq f \leq 1000 \\ &\quad 0 \leq s \leq 1500 \end{aligned} $$ suppose the maximum profit is $p^\star$. How can we bound $p^\star$? A lower bound on $p^\star$ can be found by picking any feasible point (since this is a maximization). For example, $\{f=0, s=0\}$ is feasible, so $p^\star \geq 12f + 9s = 0$. Since every feasible point yields a lower bound on $p^\star$, and $p^\star$ itself is attained at a feasible point, finding the largest lower bound on $p^\star$ is equivalent to solving the LP.
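The bounding idea can be checked numerically. The sketch below (plain Python, with a brute-force search over the integer grid standing in for an actual LP solver) evaluates feasible points as lower bounds on $p^\star$; for this particular LP the optimum happens to land on integer coordinates, so the grid search recovers it.

```python
def feasible(f, s):
    """Check all constraints of the LP above."""
    return (4*f + 2*s <= 4800 and f + s <= 1750
            and 0 <= f <= 1000 and 0 <= s <= 1500)

def profit(f, s):
    return 12*f + 9*s

# Any feasible point gives a lower bound on the optimal profit p*.
assert feasible(0, 0) and profit(0, 0) == 0            # p* >= 0
assert feasible(100, 100) and profit(100, 100) == 2100 # a better lower bound

# Brute force over integer points: the best such lower bound.
best = max(profit(f, s) for f in range(1001) for s in range(1501) if feasible(f, s))
print(best)  # 17700, attained at f=650, s=1100
```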

RPC

[big data]

Networks

Network Interface Controllers (NICs) can connect a computer to different physical mediums, such as Ethernet and Wi-Fi. Every NIC in the world has a unique MAC (media access control) address. It consists of 48 bits. Therefore, there are about 281 trillion possible MAC addresses. Some devices randomly change their MAC address for privacy.
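The size of a 48-bit address space is easy to count exactly:

```python
# 48-bit MAC addresses: 2**48 possible values.
n = 2 ** 48
print(n)  # 281474976710656, i.e. about 281 trillion
```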

You can use the command ifconfig to check your network interface controllers and their corresponding MAC addresses. There exist virtual interfaces as well. For example, lo, the loopback device, is a virtual interface. It connects your computer to a mini network containing just your computer.

Docker

[big data]

Virtualization

Virtualization is the illusion of private resources, provided by software. Examples include virtual memory, virtual machines (hardware), virtual machines (languages), and virtual operating systems (containers).

Containers and Virtual Machines are Sandboxes.

Perceptron Learning Algorithm

[ml]

Given a dataset \(\mathcal{D} = \{(\vec{x}_1, y_1), \cdots, (\vec{x}_N, y_N)\}\) and a hypothesis set \(\mathcal{H}\), our learning algorithm \(\mathcal{A}\) tries to learn a function \(g \in \mathcal{H}\) that approximates the underlying, true function \(f: \mathcal{X} \to \mathcal{Y}\), which generates the points in \(\mathcal{D}\).

Credit Card Approval Problem

Given a customer who is applying for a credit card, we want to build a system that determines whether to grant the application based on the customer's information, such as age, annual salary, years in the job, etc. The bank's historical credit approval data can be seen as a dataset \(\mathcal{D} = \{(\vec{x}_i, y_i)\}_{i=1}^N\), where each \(\vec{x}_i \in \mathcal{X}\) represents a customer. There is a target function \(f: \mathcal{X} \to \mathcal{Y}\) that gives \(\vec{x}\)'s credit behavior \(f(\vec{x}) = y\). Each \(\vec{x}\) is a multidimensional vector in which each component is a feature (age, for example). Our learning algorithm \(\mathcal{A}\) considers a hypothesis class \(\mathcal{H}\), takes the dataset \(\mathcal{D}\), and tries to produce a function \(g \in \mathcal{H}\) so that \(g\) performs similarly to \(f\). We will use \(g\) as our system for approving credit cards.
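A minimal sketch of the perceptron learning algorithm itself, using the classic update rule (pick a misclassified point \(\vec{x}_i\) and set \(\vec{w} \leftarrow \vec{w} + y_i \vec{x}_i\)); the 2-D toy dataset below is an assumed example, not the bank's data:

```python
import numpy as np

def pla(X, y, max_iters=1000):
    """Perceptron learning algorithm. Returns weights w with sign(X @ w) == y,
    assuming the data are linearly separable. X includes a bias column of 1s."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        preds = np.sign(X @ w)
        wrong = np.where(preds != y)[0]
        if len(wrong) == 0:
            return w               # no misclassified points: g fits D perfectly
        i = wrong[0]               # pick a misclassified point
        w = w + y[i] * X[i]        # PLA update
    raise RuntimeError("did not converge (data may not be separable)")

# Assumed toy data: bias feature 1 prepended; label +1 iff x1 + x2 > 3.
X = np.array([[1, 0, 1], [1, 1, 0], [1, 3, 3], [1, 4, 2]], dtype=float)
y = np.array([-1, -1, 1, 1], dtype=float)
w = pla(X, y)
assert np.all(np.sign(X @ w) == y)  # g agrees with the labels on D
```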

Clustering

[ml]

In unsupervised learning, there are no labels associated with features. Generally speaking, the ultimate goal of unsupervised learning is to find patterns and structures that help us to better understand data. Sometimes, we also use unsupervised learning to model a distribution. But we generally will not make predictions.

There are three types of clustering:

1. Partitional (centroid, graph-theoretic, spectral)
2. Hierarchical (agglomerative, divisive)
3. Bayesian (decision-based, non-parametric)
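As a concrete instance of centroid-based partitional clustering, here is a minimal k-means sketch (Lloyd's algorithm: alternate point assignment and centroid update) on two assumed well-separated blobs:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its cluster
        # (keep the old centroid if a cluster happens to be empty).
        centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else centroids[j] for j in range(k)])
    return centroids, labels

# Assumed toy data: two well-separated blobs around (0, 0) and (10, 10).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(10, 0.5, (20, 2))])
_, labels = kmeans(X, k=2)

# Each blob should end up entirely in one cluster.
assert len(set(labels[:20])) == 1 and len(set(labels[20:])) == 1
```

Note that k-means only uncovers structure; consistent with the point above, it makes no predictions about labels.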