PostgreSQL is a powerful, open-source object-relational database system. This article will guide you through the installation process, basic operations, and understanding its architecture.
All files related to this article can be found in my GitHub repository at the following link:
PostgreSQL_Introduction
.
In addition to the Jupyter Notebook containing the same content as this article, you will find both students_20.csv and students_30.csv for database import, along with the connect_and_load.py and query_database.py scripts. These scripts provide the same functionality as the Python code at the end of this article, allowing you to run the code independently of the notebook environment.
PostgreSQL can be installed on various operating systems. Below are the instructions for the most popular ones.
Ubuntu includes PostgreSQL in its default repositories. To install it, follow these steps:
The easiest way to install PostgreSQL on Windows is using the interactive installer provided by EDB:
.exe file.postgres superuser, choose a port (default is 5432), and select components like pgAdmin 4.On MacOS, using Homebrew is the most popular choice for developers:
Once installed, you can interact with PostgreSQL using the psql command-line tool.
To log in as the default administrative user on Linux:
On Windows/MacOS you might need:
and provide password to postgres role that was set during installation (or configured afterward).
Inside the psql prompt, run the following commands to set up a new environment:
Connect to your database and create a simple table:
To add a new column for the enrollment date, use the ALTER TABLE command:
You can insert individual records using the INSERT INTO statement:
If you have a large dataset in a CSV file, you can use the \copy command in psql. This is very efficient for bulk imports:
Note: The CSV file must be accessible from the machine where you are running the psql client. By default, it is looked for in the folder from which you started psql because a relative path is provided. You can also provide an absolute path instead of just the file name 'students_20.csv'.
PostgreSQL follows a client-server model:
postgres) manages the database files, accepts connections, and executes queries.Because of this architecture, clients can connect to PostgreSQL even if it is running on another computer.
As a background service, PostgreSQL reserves a network port to listen for client connections. The default port is 5432. To establish a connection, a client typically requires: Host, Port, Database Name, User Name, and Password.
Below is a Python example using psycopg that demonstrates how to connect, load data from a second CSV file (students_30.csv), and count the total records.
The following Python script provides an interactive way to query and modify your PostgreSQL database directly from the notebook. It runs in a loop, allowing you to execute multiple SQL commands sequentially. For SELECT queries, it automatically formats and displays the results as a table, while for other operations (like INSERT, UPDATE, or DELETE), it confirms the success of the operation. You can exit the interactive mode at any time by typing q or exit.
Problem: "Peer authentication failed for user 'postgres'".
Solution: Linux defaults to OS-level auth. Use sudo -u postgres psql or update pg_hba.conf.
Problem: Service fails to start or "Port 5432 already in use".
Solution: Use Resource Monitor to find conflicting processes.
Problem: "psql: error: could not connect to server".
Solution: Ensure server is running with brew services start postgresql.
+48 790-430-860
analysislessons@gmail.com