ZAHIR MOHIDEEN: PostgreSQL

Showing posts with label PostgreSQL. Show all posts

Tuesday, December 18, 2018

GROUP BY behavior in MySQL / MariaDB

It is better to unlearn and relearn the SQL , if we need to work on different DBMS. I was surprised by the "GROUP BY" behavior in MySQL / MariaDB.

Typically , when we are using GROUP BY , all the columns except the aggregated columns mentioned in the SELECT clause has to be defined in the GROUP BY clause. If any columns are missed , then the prompt produces an error . However , in MySQL / MariaDB , this produces the result ( sum / avg ) . This may be a side effect ,which may not be the desirable result , we are looking for.

Let us create a simple table in Oracle , MS SQL Server , PostgreSQL , MySQL and MariaDB.When I issue the GROUP BY without mentioning other columns , all the DBMSes except MariaDB and MySQL produced an error.

Here is an example

ORACLE

SQL> Create table emp_hours
2 ( empid int ,
3 workweek int ,
4 hrstype varchar(10) ,
5 hrs number(5,2)
6 ) ;

Table created.

SQL> insert into emp_hours values ( 1 , 20 , 'REG' , 20 ) ;
1 row created.

SQL> insert into emp_hours values( 1 , 20 , 'OT' , 4.5 ) ;
1 row created.

SQL> insert into emp_hours values ( 2 , 20 , 'REG' , 40 ) ;
1 row created.

SQL> insert into emp_hours values( 2 , 20 , 'REG' , 35 ) ;
1 row created.

SQL> insert into emp_hours values ( 3 , 20 , 'OT' , 4.5 ) ;
1 row created.

SQL> commit;
Commit complete.

SQL> select empid , hrstype , sum(hrs) from emp_hours;
select empid , hrstype , sum(hrs) from emp_hours
*
ERROR at line 1:
ORA-00937: not a single-group group function

SQL> select empid , hrstype , sum(hrs) from emp_hours group by empid , hrstype ;

EMPID HRSTYPE SUM(HRS)
---------- ---------- ----------
3 OT 4.5
1 REG 20
2 REG 75
1 OT 4.5

MS SQL Server

1> Create table emp_hours
2> ( empid int ,
3> workweek int ,
4> hrstype varchar(10) ,
5> hrs numeric(5,2)
6> ) ;
7> go
1> insert into emp_hours values ( 1 , 20 , 'REG' , 20 ) ;
2> insert into emp_hours values( 1 , 20 , 'OT' , 4.5 ) ;
3> insert into emp_hours values ( 2 , 20 , 'REG' , 40 ) ;
4> insert into emp_hours values( 2 , 20 , 'REG' , 35 ) ;
5> insert into emp_hours values ( 3 , 20 , 'OT' , 4.5 ) ;
6> go

(1 rows affected)

(1 rows affected)

(1 rows affected)

(1 rows affected)

(1 rows affected)

1> select empid , hrstype , sum(hrs) from emp_hours;
2> go
Msg 8120, Level 16, State 1, Server AUSTIN02, Line 1
Column 'emp_hours.empid' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

1> select empid , hrstype , sum(hrs) from emp_hours group by empid , hrstype ;
2> go
empid hrstype
----- ------- -
1 OT 4.50
3 OT 4.50
1 REG 20.00
2 REG 75.00

(4 rows affected)

PostgreSQL

Similarly , we get the error as shown below in Postgres for the above test case.

ERROR: column "emp_hours.empid" must appear in the GROUP BY clause or be used in an aggregate function
SQL state: 42803

Character: 8

MySQL

mysql> select version();
+-------------------------+
| version() |
+-------------------------+
| 5.5.49-0ubuntu0.14.04.1 |
+-------------------------+
1 row in set (0.00 sec)

mysql> select empid , hrstype , sum(hrs) from emp_hours;
+-------+---------+----------+
| empid | hrstype | sum(hrs) |
+-------+---------+----------+
| 1 | REG | 104.00 |
+-------+---------+----------+
1 row in set (0.00 sec)

Instead of a error , the above SQL summed up the hours and produced one record.

Personally , I would prefer the above SQL to error out , instead of giving this answer.

mysql> select empid , hrstype , sum(hrs) from emp_hours group by empid , hrstype ;
+-------+---------+----------+
| empid | hrstype | sum(hrs) |
+-------+---------+----------+
| 1 | OT | 4.50 |
| 1 | REG | 20.00 |
| 2 | REG | 75.00 |
| 3 | OT | 4.50 |
+-------+---------+----------+
4 rows in set (0.05 sec)

MariaDB

Tuesday, January 24, 2017

Implementation difference between PostgreSQL and MS SQL Server in varchar with no length.

Today , I learned something new in PostgreSQL. Surprised to see the implementation difference between PostgreSQL and MS SQL Server .

In PostgreSQL and SQL Server , I can define a 'varchar' or 'char' datatype with no width . In PostgreSQL , 'char' datatype with no length means char(1) , whereas 'varchar' datatype with no length means that it can accommodate any size

Here is the example from PostgreSQL

sales=# SELECT version();
version
-----------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.9 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4, 64-bit
(1 row)

sales=# create table t( c char) ;
CREATE TABLE
sales=# insert into t values ('AA');
ERROR: value too long for type character(1)
sales=# insert into t values ('A');
INSERT 0 1
sales=# select * from t;
c
---
A
(1 row)

sales=# drop table t;
DROP TABLE
sales=#
sales=# create table t( c varchar) ;
CREATE TABLE
sales=# insert into t values ('AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA')
sales-# ;
INSERT 0 1
sales=# Select * from t;
c
---------------------------------------------------------------------------------------
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Here is the example from SQL Server.

C:\>sqlcmd -W
1> use test
2> go
Changed database context to 'test'.
1> Select @@version ;
2> go

-
Microsoft SQL Server 2014 (RTM-CU14) (KB3158271) - 12.0.2569.0 (X64)
May 27 2016 15:06:08
Copyright (c) Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3 (Build 14393: )

(1 rows affected)
1> drop table t;
2> go
1> create table t( c char) ;
2> go
1> insert into t values ('AA');
2> go
Msg 8152, Level 16, State 14, Server DOCS-0961, Line 1
String or binary data would be truncated.
The statement has been terminated.
1> insert into t values ('A');
2> go

(1 rows affected)
1> Select * from t;
2> go
c
-
A

(1 rows affected)
1> drop table t ;
2> create table t( c varchar) ;
3> go
1> insert into t values ('AA');
2> go
Msg 8152, Level 16, State 14, Server DOCS-0961, Line 1
String or binary data would be truncated.
The statement has been terminated.
1> insert into t values ('A');
2> go

(1 rows affected)
1> Select * from t;
2> go
c
-
A

(1 rows affected)

1>

As can be seen from the script above , PostgresQL and SQL Server behave the same way when a column is defined 'char' with no length. And it is so drastically different when it comes to 'varchar' with no length.

In Oracle , this is not allowed . Any 'char' or 'varchar' has to have a length defined .

Learning something new every day.

Thursday, April 14, 2016

Accessing Postgresql / MySQL / MS SQL Server from Oracle.

Gone are the days , where a business has only one DBMS to maintain its operations.

Most of the times , you are dealing with multiple DBMS in your organization.

The vendors provide some means of connecting to distributed databases . For instance ,

Oracle provides "database link" ( via transparent gateway), MS SQL Server provides "Linked Server" mechanism to connect to heterogeneous databases and other data sources.

Here , we will look at connecting to PostgreSQL from Oracle by using transparent ODBC gateway . This process is applicable any ODBC complaint data sources .

Oracle does provides transparent gateways for MS SQL Server , DB2 . But those needs to be licensed.

As of today , ODBC transparent gateway is not separately licensed . Please check with your rep for the licensing details.

Here are the steps.

Step 1 :

Configure ODBC connection and a create a system DSN for PostgreSQL

Let us call it as "pgsales”

Note : This needs to be configured as "System DSN" , not as User DSN.

Step2 :

Create a file under %ORACLE_HOME%\hs\admin in the form initpgsalesodbc.ora . The naming convention is “init” followed by desired name . In my case , I have it as initpgsalesodbc.ora.

Set the HS_FDS_CONNECT_INFO to the DSN name that was configured in ODBC setup .

I would recommend to keep the trace level to 'ON' for the initial setup . Once the setup is complete ; connectivity is established ; we can turn off the trace .

Oracle provides sample file . We can use clone this file to suit our needs.

Here are the contents of initpgsalesodbc.ora.

# This is a sample agent init file that contains the HS parameters that are

# needed for the Database Gateway for ODBC

# HS init parameters

HS_FDS_CONNECT_INFO = pgsales

HS_FDS_TRACE_LEVEL = ON

# Environment variables required for the non-Oracle system

#set =

Step 3 :

Add a entry in listener.ora to register this data source. The SID name must match with the init file created in the above step.

(SID_DESC =

(SID_NAME = pgsalesodbc)

(ORACLE_HOME = c:\app\oracle\product\12.1.0\dbhome_1)

(PROGRAM = dg4odbc)

)

Step 4:

Add entry in tnsnames.ora . The SID in connect_data should match SID defined in the listerner.ora and the file in hs\admin.

pgsqllink =

(DESCRIPTION=

(enable=broken)

(ADDRESS=(PROTOCOL=tcp)(HOST=ZAHIRSER)(PORT=1521))

(CONNECT_DATA=(SID= pgsalesodbc))

(HS=OK)

)

Step 5:

Restart the listener.You should see to make sure that the listener is servicing the services we created.

Service "pgsalesodbc" has 1 instance(s).

Instance "pgsalesodbc", status UNKNOWN, has 1 handler(s) for this service...

Step 6:

Create a database link .

In the following example , LINK2PGSQL is our db link , 'zahir' is the user id in the postgresql database and "passwd1234" is the password .

SQL> CREATE DATABASE LINK LINK2PGSQL CONNECT TO "zahir" IDENTIFIED BY "passwd1234" USING 'pgsqllink' ;

Database link created.

To test the connectivity , issue a select statement as below . Please note , for Postgresql databases , you will need to enclose the table/view names in quotes.

SQL> select * from "t"@LINK2PGSQL;

With this feature , we can manipulate the data stored in heterogeneous databases .

Comments Welcome .