Introduction to Postgres

In this section, we study what is Postgres and what is SQL.

What is Postgres?

What is PostgreSQL?

PostgreSQL is an advanced, enterprise-class, and open-source relational database system. PostgreSQL supports both SQL (relational) and JSON (non-relational) querying.

PostgreSQL is a highly stable database backed by more than 20 years of development by the open-source community.

PostgreSQL is used as a primary database for many web applications as well as mobile and analytics applications.

Why use PostgreSQL?

PostgreSQL comes with many features aimed to help developers build applications, administrators to protect data integrity and build fault-tolerant environments, and help you manage your data no matter how big or small the dataset. In addition to being free and open source, PostgreSQL is highly extensible. For example, you can define your own data types, build out custom functions, even write code from different programming languages without recompiling your database!

Installing Procedural Languages

PostgreSQL allows user-defined functions to be written in other languages besides SQL and C. These other languages are generically called procedural languages (PLs). For a function written in a procedural language, the database server has no built-in knowledge about how to interpret the function's source text. Instead, the task is passed to a special handler that knows the details of the language. The handler could either do all the work of parsing, syntax analysis, execution, etc. itself, or it could serve as “glue” between PostgreSQL and an existing implementation of a programming language. The handler itself is a C language function compiled into a shared object and loaded on demand, just like any other C function.

There are currently four procedural languages available in the standard PostgreSQL distribution: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python.

What is SQL?

Structured query language (SQL) is a programming language for storing and processing information in a relational database. A relational database stores information in tabular form, with rows and columns representing different data attributes and the various relationships between the data values. You can use SQL statements to store, update, remove, search, and retrieve information from the database. You can also use SQL to maintain and optimize database performance.

More information about SQL in the next link:

Creating a Postgres database in AWS linked to our project

So far, we have used a sqlite database to develop our app. Now, it is time to deploy our app in AWS while we connect our data to a Postgres database, i.e., we must create a production database. The links to the YouTube video, and the GitHub account for this section are below:

Procedure to connect our data to a Postgres database

The procedure to connect our data to a Postgres database consists of three steps:

  1. In Amazon Web Service. In the environment where we have hosted our app, we must move to configuration and in the database, we must create a database and modify the inbound rules to allow the access to the database.. Follow the instructions in the video.

  2. In our local computer. Once that we have created a database in AWS, we must create all the databases and the tables. We must do that from our local computer following the next steps:

    1. In the __init__.py file, we must type the next code:
      
      DBVAR = 'postgresql://username:password@endpoint:5432/ebdb'
      application.config['SQLALCHEMY_DATABASE_URI'] = DBVAR 
      application.config['SQLALCHEMY_BINDS'] = {'transport': DBVAR}                                    
                                      
      Where in DBVAR, you must introduce the next five blocks of information:

      • Username: This is the username that you chose when you created the database. I suggest you choose the username postgres, since when we program the database by using the command line and pgAdmin, that is the standard name. However, choose the name that you decide, but be sure that you remember it.

      • Password: This is the password that you choose when you create the database. Remember that password, since we will need it later on to connect to the database.

      • Endpoint: This is the endpoint to the database that you get from the RDS page in AWS.

      • Port: The port for postgres databases is 5432.

      • Database name: The database name when you create a database inside Elastic Beanstalk is ebdb by ElasticBeanstalkDataBase.

    2. In the terminal, we must type: python. Once that we are in the python terminal, we must type: from capp import db, and then: db.create_all().

  3. In GitHub. As with the SECRET_KEY, we want to protect sensitive information in our GitHub account, since we do not want to disclose our username, password, and the endpoint for our database. To protect that information, we need to introduce the next lines of code and place that code in GitHub:
    
    DBVAR = f"postgresql://{os.environ['RDS_USERNAME']}:{os.environ['RDS_PASSWORD']}
    @{os.environ['RDS_HOSTNAME']}/{os.environ['RDS_DB_NAME']}"
                                   
  4. In AWS, got to CodePipeline and release the changes. One that we have release the changes, we should verify that our app works properly by populating the database (register a new user and adding some entries in the carbon app). We also must verify that the login functionality works properly.

Connection and queries to our Postgres database

Connection to our Postgres database by using pgAdmin

Once that we have created our databases with tables, it is the time to access to those databases and to make some queries to use the information in those databases. We use pgAdmin to access to our databases. We proceed in three different steps.

  1. Download pgAdmin for Mac or Windows:

  2. In AWS, we must allow our database to be accessed from different IPs (university, home, etcetera). To do that you have to follow the next steps (follow the video instructions).

    1. In AWS RDS, access to your database.

    2. Select the VPC security groups.

    3. Select the Inbound rules.

    4. Edit the inbound rules by creating a new inbound rule, where you allow all the traffic from your current IP. Be aware that HVL, the IP changes regularly, therefore, you will probably need to edit the inbound rules several times.

  3. In pgAdmin, we must connect to the database hosted in AWS server by following the next steps:

    1. Create a new server connection, and name it as you prefer.

    2. In the connection introduce the EndPoint from your RDS Amazon database.

    3. In the connection, you need to also introduce the user and the password that you created in AWS.

Queries in our database by using pgAdmin

All the queries in this section are an adaptation to our databases of the queries in the next video:

We are going to introduce some useful queries.

SELECT command:

SELECT * FROM public.user_table;
SELECT username, email FROM public.user_table;
SELECT password FROM public.user_table;
SELECT kms, tansport FROM public.transport_table;
                
ORDER BY command:

SELECT * FROM public.user_table ORDER BY username;
SELECT * FROM public.user_table ORDER BY username ASC;
SELECT * FROM public.user_table ORDER BY username DESC;
SELECT * FROM public.transport_table ORDER BY kms DESC;
SELECT * FROM public.transport_table ORDER BY fuel ASC;
                
WHERE, AND clauses:

SELECT * FROM public.transport_table WHERE fuel= 'Diesel';
SELECT * FROM public.transport_table WHERE fuel= 'Diesel' 
AND transport= 'Car';
SELECT * FROM public.transport_table WHERE fuel= 'Diesel' 
AND (transport= 'Car' or transport= 'Motorbike');
SELECT * FROM public.transport_table WHERE fuel= 'Diesel' 
AND (transport= 'Car' or transport= 'Motorbike') ORDER BY transport DESC; 
                
LIMIT, and OFFSET:

SELECT * FROM public.user_table LIMIT 10;
SELECT * FROM public.transport_table OFFSET 5 LIMIT 5; 
                
IN:

SELECT * FROM public.transport_table WHERE transport 
IN ('Car', 'Ferry');
SELECT * FROM public.transport_table WHERE transport 
IN ('Car', 'Ferry') ORDER BY fuel;
                
Calculating MIN, MAX and AVG:

SELECT MAX(kms) FROM public.transport_table;
SELECT MIN(kms) FROM public.transport_table;
SELECT AVG(kms) FROM public.transport_table;
SELECT ROUND(AVG(kms)) FROM public.transport_table;
SELECT transport, MIN(co2) FROM public.transport_table 
GROUP BY transport;
SELECT fuel, ROUND(AVG(co2)) FROM public.transport_table 
GROUP BY fuel;
                
SUM:

SELECT SUM(kms) FROM public.transport_table;
SELECT transport, SUM(co2) FROM public.transport_table 
GROUP BY transport;
                
Timestamps and Dates:

SELECT NOW();
SELECT NOW()::DATE;
SELECT NOW()::TIME;
                
Timestamps and Dates:

SELECT NOW();
SELECT NOW()::DATE;
SELECT NOW()::TIME;
                
Adding and Subtracting with Dates:

SELECT NOW() - INTERVAL '1 YEAR';
SELECT NOW()::DATE - INTERVAL '1 YEAR';
SELECT NOW()::DATE - INTERVAL '5 DAYS';
SELECT (NOW() + INTERVAL '5 DAYS')::DATE;
                
Extracting Fields:

SELECT EXTRACT(YEAR FROM NOW());
SELECT EXTRACT(MONTH FROM NOW());
SELECT EXTRACT(DAY FROM NOW());
SELECT EXTRACT(DOW FROM NOW());
SELECT EXTRACT(CENTURY FROM NOW());
                
DELETE records:

DELETE FROM  public.user_table WHERE id=1;
DELETE FROM  public.transport_table WHERE id=1;
                
UPDATE records:

UPDATE public.user_table  SET email='hello@demo.com' 
WHERE id=1;
SELECT * FROM public.user_table  WHERE id=1;
UPDATE public.user_table  SET email='hello1@demo.com',
username='name1' WHERE id=1;
SELECT * FROM person WHERE id=1;
                
ON CONFLICT, DO NOTHING:

INSERT INTO public.user_table 
(id, username, email, password)
VALUES (1, 'name2', 'email2@demo.com', 'regn')
ON CONFLICT (id) DO NOTHING;
SELECT * FROM public.user_table;
                
UPSERT:

SELECT * FROM public.user_table  WHERE id=1;
INSERT INTO public.user_table 
(id, username, email, password) 
VALUES (1, 'name2', 'email4@demo.com', 'regn')
ON CONFLICT (id) DO UPDATE SET email=EXCLUDED.email;
                
INNER JOINS:

SELECT * FROM public.user_table
JOIN public.transport_table ON 
public.transport_table.user_id = public.user_table.id;
                
LEFT JOINS:

SELECT * FROM public.user_table
LEFT JOIN public.transport_table ON 
public.transport_table.user_id = public.user_table.id;
                

Weekly challenge

In this weekly challenge, we develop some queries that can be relevant to extract information from our database. Based on the example queries in the previous section, please create your own queries to extract information from your databases.