SQL
Structured Query Language (SQL) is a language for storing, manipulating, and retrieving data in databasesBasic Queries
Select: get values from database
// general select query SELECT column1, column2 FROM table_name; // select all columns SELECT * FROM table_name; // list only distinct values SELECT DISTINCT column1, column2 FROM table_name; // order results by column SELECT column1, column2 FROM table_name ORDER BY column2, column1 ASC|DESC; // select with conditional where clause SELECT column1, column2 FROM table_name WHERE condition; // select null/non-null values SELECT column1, column2 FROM table_name WHERE column1 IS (NOT) NULL; // select top n rows from table SELECT TOP n column1, column2 FROM table_name WHERE condition; // select top p percent of rows from table SELECT TOP p PERCENT column1, column2 FROM table_name WHERE condition; // alternative to top for PSQL & MySQL SELECT column1, column2 FROM table_name WHERE condition LIMIT n; // min and max values SELECT MIN(column1) as Minimum FROM table_name; SELECT MAX(column1) as Maximum FROM table_name; // count num of records matching condition SELECT COUNT(column1) FROM table_name WHERE condition // avg value of records matching condition SELECT AVG(column1) FROM table_name WHERE condition // sum of values of records matching condition SELECT SUM(column1) FROM table_name WHERE condition // match records like a pattern, where pattern matches below SELECT column1, column2 FROM table_name WHERE columnN LIKE pattern // IN is short for multiple OR operations SELECT column1, column2 FROM table_name WHERE column2 IN (value1, value2); // IN can also be used with SELECT SELECT column1, column2 FROM table_name WHERE column2 IN (SELECT * FROM other_table); // return values BETWEEN inclusive range SELECT column1, column2 FROM table_name WHERE column2 BETWEEN value1 AND value2; // aliases, call columns by specified name SELECT column1 as col1, column2 as col2 FROM table_name WHERE condition; // can also use aliases for tables SELECT a.col1, a.col2, b.col1 FROM table_a as a, table_b as b WHERE a.prop = 'prop'; // this shortens standard dot notation SELECT table_a.col1, table_a.col2, table_b.col1 FROM table_a, table_b WHERE table_a.prop = 'prop'; // joins are used to combine two tables based on a common column // standard JOIN syntax SELECT column1, column2 FROM table1 <JOIN_TYPE> JOIN table2 ON table1.common_col = table2.common_col; // INNER JOIN is like intersection; joins on common values across both tables SELECT column1, column2 FROM table1 INNER JOIN table2 ON table1.common_col = table2.common_col; // LEFT JOIN matches all records from left table, and merges with corresponding values from right table SELECT column1, column2 FROM table1 LEFT JOIN table2 ON table1.common_col = table2.common_col; // RIGHT JOIN matches all records from right table, and merges with corresponding values from left table SELECT column1, column2 FROM table1 RIGHT JOIN table2 ON table1.common_col = table2.common_col; // OUTER JOIN is like union; joins on any matching records in either table SELECT column1, column2 FROM table1 FULL OUTER JOIN table2 ON table1.common_col = table2.common_col; // SELF JOIN joins a table with itself as if it were two; used to perform relative comparisons SELECT A.column1, A.column2 FROM table1 A, table1 B WHERE A.col1 < B.col1; // UNION used to combine results from multiple select statements SELECT column1, column2 FROM table1 UNION SELECT column1, column2 FROM table2 // by default, union only selects distinct values; use ALL to allow duplicates SELECT column1, column2 FROM table1 UNION ALL SELECT column1, column2 FROM table2 // group by // concat function
Update: change values already in table
// conditional update UPDATE table_name SET column1 = value1, column2 = value2 WHERE condition; // omitted WHERE clause updates all records UPDATE table_name SET column1 = value1, column2 = value2;
Insert: add new records to table
// general insert query INSERT INTO table_name (column1, column2) VALUES (value1, value2) // insert values for all columns INSERT INTO table_name VALUES (value1, ..., valuen)
Delete: delete existing records in table
// conditional delete statement DELETE FROM table_name WHERE condition; // delete all records from table DELETE FROM table_name; // delete all records quickly for large tables // it effectively drops and recreates the table // instead of deleting rows. This frees storage space // immediately, which can be critical in some cases (ahem). TRUNCATE TABLE table_name;
// describe table schema
DESCRIBE table_name
// change column type
ALTER TABLE table_name
ALTER COLUMN column1 TYPE newtype,
ALTER COLUMN columnN TYPE newtype,
// rename column
ALTER TABLE table_name
RENAME COLUMN col1 TO col2
// add column after table creation
ALTER TABLE table_name
ADD column1 datatype
// add primary key after table creation
ALTER TABLE table_name
ADD PRIMARY KEY (column1, ..., columnN)
// rename table
ALTER TABLE table_name
RENAME TO new_table_name
// add/remove not null constraint
ALTER TABLE table_name
ALTER COLUMN column1 SET [DROP] NOT NULL;
Standards
// operators for conditional statements
Arithmetic Operators+ Addition
- Subtrsction
* Multiplication
/ Division
% Modulus
for WHERE conditions)
Comparison Operators (= check if values are equal
!= check if values arent equal
>, < greater than, less than comparisons
>=, <= greater than or equal, less than or equal
- BETWEEN: between an inclusive range
- LIKE: search for a pattern, where pattern can have wildcards
- "%": represents 0, 1, or more characters
- "_": represents 1 character
- IN: specify multiple possible values for column
WHERE Conditions
Joining AND
OR
NOT
Logical OperatorsALL
AND
ANY
BETWEEN
EXISTS
IN
LIKE
NOT
OR
IS NULL
UNIQUE
-in Datatypes Built
Database Setup and Management
// create new database
CREATE DATABASE dbname
// drop/delete database
DROP DATABASE dbname
// select database for use
USE dbname
// create table within database
CREATE TABLE table_name(
NOT NULL,
column1 datatype
column2 datatype,...
columnN datatype,PRIMARY KEY (column1, column2)
)
// create table within foreign key
CREATE TABLE table_name(
NOT NULL,
column1 datatype
column2 datatype,...
columnN datatype,PRIMARY KEY (column1, column2)
FOREIGN KEY (column3, column4) REFERENCES other_table (column3, column4)
ON UPDATE CASCADE
)
// describe table
DESC table_name
// drop/delete table
DROP TABLE table_name
PostgreSQL
Setup
# install on ubunutu
$ sudo apt-get install postgresql
# installation creates a postgres user account
# enter interactive postgres session
$ sudo -u postgres psql
# the 'sudo -u postgres cmd' executes 'cmd' as user postgres.
$ sudo -i -u postgres
# will switch the account to postgres, and you can simply use 'psql' from there on
# interactively set up new user
$ sudo -u postgres createuser --interactive
# create corresponding database
$ sudo -u postgres createdb <newuser>
# by default, typing 'psql' will try to use a user with same name as linux user
# the pg_hba.conf also restricts using psql to user based authentication only,
# meaning psql asks the OS for a linux user login name and uses that as authentication.
# change this by changing 'peer' to 'trust' in '/etc/postgresql/9.5/main/pg_hba.conf'
# for the first two uncommented config lines.
# then, restart postgres server
$ sudo service postgresql restart
# you can now access psql with the new user by calling
$ psql -U <newuser>
# note how you no longer have to execute commands from the postgres user
# login with user postgres:
$ psql -U postgres
# set password for new user or for postgres
postgres$ ALTER USER postgres WITH PASSWORD 'new_pass';
# now that user passwords are set, go back to pg_hba.conf and change
# 'trust' to 'md5' to allow passwords for psql user login
# use psql (from any linux account) under new user with
$ psql -U <newuser>
# where you will prompted with the user's password. This will connect to
# the database sharing the name 'newuser'
Usage
# OTHER COMMANDS
$ psql -h <myhost> -p <port> -U <user> -d <dbname>
# use this to be explicit about the server, port, user, and database you're connecting to
# gives info on current connection
psql$ \conninfo
# list databases
psql$ \l
# show all tables in database
psql$ \dt
# describe a table
psql$ \d <tablename>
# switch to database
psql$ \c <db-name>
# setting up a tablespace, a defined location for databases/tables to be stored,
# can be helpful if you need to store a particular db on a different partition
CREATE TABLESPACE <name> LOCATION '<path>';
# an existing db can then be set to this location
ALTER DATABASE <dbname> SET TABLESPACE <name>;
# see absolute table sizes on disk
SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 5;
# restoring/creating a table in a .sql written with pg_dump
# note the "-1" stands for "single transaction"
# also note the table name will be the same as before; to avoid conflicts
# consider using a new test database and make sure the tablespace is correctly set
psql -U <user> -d <database> -1 -f /path/to/file.sql
Errors
No Disk Space
The main errors I’ve encountered when working with Postgres have occurred when I no longer have any space left on the system (mainly due to the database itself). I will then try to clear up some rows by running a delete command (if Postgres can even start), and will get a “PANIC” error and the server immediately shuts down, and can’t restart. This is mostly due to the fact that there is not space to write temp files during deletion, and then when restarting the server Postgres doesn’t have enough space to write to the important pg_xlog
directory (see here for more details). In this case, Postgres won’t even be able to start for you to actually delete more rows. So far, the only solution I’ve seen to this problem is to attempt to free up just enough space elsewhere on the machine to allow Postgres to be able to write files to this directory and actually startup. So to address this problem, clear up some space (a few hundred MB if possible, the equivalent of a Python venv) and restart Postgres.
Backing up a table directly another machine
When the above situation comes up, I typically want to back up the largest tables or databases to my local machine without storing any sort of temporary files on the already full server. One way this can be done is pg_dump
ing straight over ssh using the following machine:
ssh -i <key> <user>@<ip> "PGPASSWORD='<password>' pg_dump -U <user> -d <db> -t <table>" \ >> <path>/backup.sql
Restoring a tablespace from a separate location
I faced this issue due to a changed drive name. I had an existing postgres tablespace under a partition named “data_ext” that I later renamed to “data”. Postgres was (understandably) unable to find this tablespace location. Apparently Postgres doesn’t allow you to change tablespace locations on the command line, so I had to find a way around it with the tbl_spc directory. This variable was not defined for me, but ended up being located at /var/lib/postgresql/10/main/pg_tblspc
, and held a single symbolic link to the manually created tablespace at the old “data_ext” location. Simply rewriting this symbolic link with sudo ln -s /new/location /absolute/path/to/symlink
updated the location properly. Note that I stopped the Postgres service before doing this to ensure nothing crazy happened while it was running.