How to use DISTINCT in Impala - SQL

Hi, I am trying to query the distinct localities in my table.
Here is my query:
select distinct city, locality, avg_sqft from real_estate.re_search where city = 'bangalore' AND locality != 'jayanagar';
Result:
+-----------+--------------+----------+
| city | locality | avg_sqft |
+-----------+--------------+----------+
| bangalore | bannerghatta | 13500 |
| bangalore | kormangala | 18000 |
| bangalore | kodipur | 7000 |
| bangalore | kormangala | 16000 |
| bangalore | horamavu | 9000 |
| bangalore | bellandur | 15500 |
| bangalore | kodipur | 9000 |
| bangalore | madivala | 12000 |
| bangalore | varthur | 12000 |
| bangalore | kormangala | 13500 |
| bangalore | bellandur | 13000 |
| bangalore | kodipur | 11500 |
| bangalore | kormangala | 14000 |
The problem is that I need each locality to appear only once in the result. Any help will be appreciated.

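DISTINCT applies to the whole select list, so every different avg_sqft value produces its own row. To get one row per locality where the city is bangalore, group by city and locality and aggregate the remaining column, for example with COUNT and AVG:
SELECT city,
       locality,
       COUNT(locality) AS listings,
       AVG(avg_sqft) AS mean_sqft
FROM real_estate.re_search
WHERE city = 'bangalore'
  AND locality != 'jayanagar'
GROUP BY city,
         locality;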
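
As a rough, runnable sketch of the GROUP BY approach, the snippet below loads a few of the rows from the question into an in-memory SQLite database. SQLite is only a stand-in for Impala here (an assumption made so the example runs anywhere), and the `jayanagar` row and the column aliases `listings` and `mean_sqft` are hypothetical.

```python
import sqlite3

# SQLite stands in for Impala (assumption); the table mirrors re_search from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE re_search (city TEXT, locality TEXT, avg_sqft INTEGER)")
rows = [
    ("bangalore", "kormangala", 18000),
    ("bangalore", "kormangala", 16000),
    ("bangalore", "kodipur", 7000),
    ("bangalore", "kodipur", 9000),
    ("bangalore", "jayanagar", 20000),  # hypothetical row, excluded by the filter
]
conn.executemany("INSERT INTO re_search VALUES (?, ?, ?)", rows)

# One output row per (city, locality); the varying column is aggregated.
query = """
    SELECT city, locality, COUNT(*) AS listings, AVG(avg_sqft) AS mean_sqft
    FROM re_search
    WHERE city = 'bangalore' AND locality != 'jayanagar'
    GROUP BY city, locality
    ORDER BY locality
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each locality now appears once, with the number of source rows and their mean avg_sqft alongside it.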

Related

I need to select each CITY_CALLING whose total number of km travelled is greater than 1000.

select CITY_CALLING,
sum(DISTANCE_KM)
from REAL_TRIP join
SOURCE_CITY
on SOURCE_CITY.city_id = REAL_TRIP.city_id
group by 1
city_CALLING | sum |
Visakhapatnam | 14.5920725980000014 |
Hyderabad | 2759.24699709970082 |
San Diego | 87.3699351497999999 |
Moscow | 984.947118170600447 |
Alexandria | 8.96134862429999934 |
Prague | 86.0471747345999916 |
Recife | 20.7398930000000021 |
Leeds | 140.606494992300014 |
Copenhagen | 14.7657918324999997 |
Fresno | 29.6572209023999989 |
Tijuana | 61.7240377603999946 |
Baton Rouge | 7.05829104329999968 |
Krasnodar | 296.730780097399986 |
Sochi | 237.51827971039998 |
Cincinnati | 116.423747349400003 |
Guwahati | 1057.34938192379968 |
Champaign | 6.8250736618000003 |
Vienna | 1180.11211812669899 |
Charlotte | 150.293475570500021 |
Raleigh-Durham | 152.720579113999946 |
Add a HAVING clause to filter on the aggregated sum:
select CITY_CALLING,
sum(DISTANCE_KM)
from REAL_TRIP join
SOURCE_CITY
on SOURCE_CITY.city_id = REAL_TRIP.city_id
group by 1
HAVING SUM(DISTANCE_KM) > 1000;
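
A minimal sketch of filtering on an aggregate with HAVING, run against an in-memory SQLite database. The table and column names follow the question, but SQLite as the engine and the individual trip distances are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_city (city_id INTEGER PRIMARY KEY, city_calling TEXT);
    CREATE TABLE real_trip (city_id INTEGER, distance_km REAL);
    INSERT INTO source_city VALUES (1, 'Hyderabad'), (2, 'Moscow'), (3, 'Vienna');
    -- Hypothetical trips: only the per-city totals matter for the demo.
    INSERT INTO real_trip VALUES (1, 1500.0), (1, 1259.25), (2, 984.5), (3, 700.0), (3, 480.0);
""")

# WHERE filters rows before grouping; HAVING filters groups after aggregation.
query = """
    SELECT s.city_calling, SUM(t.distance_km) AS total_km
    FROM real_trip t
    JOIN source_city s ON s.city_id = t.city_id
    GROUP BY s.city_calling
    HAVING SUM(t.distance_km) > 1000
    ORDER BY s.city_calling
"""
result = conn.execute(query).fetchall()
for city, total_km in result:
    print(city, total_km)
```

Moscow is dropped because its total stays below 1000 km, while the other two cities pass the filter.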

How to distinctly select a column while selecting other columns

I have a table in Hive that looks something like the one below. Every 3 hours I want to look at the unique workerUUIDs and do some manipulation on them. So, for the window between now and 3 hours ago, I want to:
Capture all the unique workerUUIDs
Select * from these workerUUIDs
I am using Hive to run this query, and the table receives a few million entries every three to six hours. What is the best way to write this query?
--------------------------------------------
| workerUUID | City | Debt | TestN | LName |
|------------------------------------------|
| 1234       | SF   | 100k | 23    | Nil   |
|------------------------------------------|
| 6789       | NY   | 150k | 34    | Fa    |
|------------------------------------------|
| 1234       | SF   | 10k  | 45    | Na    |
|------------------------------------------|
| 6789       | NY   | 1k   | 13    | Nil   |
|------------------------------------------|
| 6789       | SF   | 150k | 34    | Nil   |
|------------------------------------------|
| 8999       | IN   | 10k  | 45    | Na    |
--------------------------------------------
Basically I want to do something like
select City, Debt, TestN where workerUUID = '1234'
select City, Debt, TestN where workerUUID = '6789'
select City, Debt, TestN where workerUUID = '8999'
To clarify further, I want to generate temporary tables like
| workerUUID | City | Debt | TestN|
|------------------------------------
| 1234 | SF | 100k | 23 |
|------------------------------------
| 1234 | SF | 10k | 45 |
|-----------------------------------|
| workerUUID | City | Debt | TestN|
|------------------------------------
| 6789 | NY | 150k | 23 |
|------------------------------------
| 6789 | NY | 1k | 13 |
|------------------------------------
| 6789 | NY | 150k | 34 |
|-----------------------------------
| workerUUID | City | Debt | TestN|
|------------------------------------
| 8999 | IN | 10k | 45 |
and so on, for every unique workerUUID generated in the 3-hour window.
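
One possible way to get per-worker groups without issuing a separate query for each workerUUID is to read the rows once, sorted by workerUUID, and split them on the client side. The sketch below assumes a table named `workers` (hypothetical) and uses SQLite in place of Hive; the 3-hour filter is left out because the sample table shows no timestamp column.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# SQLite stands in for Hive (assumption); `workers` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workers (workerUUID TEXT, City TEXT, Debt TEXT, TestN INTEGER)")
conn.executemany("INSERT INTO workers VALUES (?, ?, ?, ?)", [
    ("1234", "SF", "100k", 23),
    ("6789", "NY", "150k", 34),
    ("1234", "SF", "10k", 45),
    ("6789", "NY", "1k", 13),
    ("6789", "SF", "150k", 34),
    ("8999", "IN", "10k", 45),
])

# A single scan, sorted so that each worker's rows are adjacent.
cursor = conn.execute(
    "SELECT workerUUID, City, Debt, TestN FROM workers ORDER BY workerUUID"
)

# groupby only merges adjacent rows, which is why the ORDER BY above matters.
groups = {uuid: list(rows) for uuid, rows in groupby(cursor, key=itemgetter(0))}

for uuid, rows in groups.items():
    print(uuid, rows)
```

Each value in `groups` plays the role of one of the temporary tables described above.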

SQL query based on a column in parent table - Parent child relationship

I have the following three tables (ACCOUNTS, CUSTOMER, EMPLOYEE) and I would like to join them on the columns AGENT_CODE and AGENT_TYPE to get the expected result below.
What is the best way to join these tables when the same AGENT_CODE can appear in both the CUSTOMER and EMPLOYEE tables?
I have this query, which gives me wrong results:
SELECT ac.AGENT_CODE,
ac.WORKING_AREA,
ac.AGENT_TYPE,
CONCAT(c.FIRST_NAME,c.LASTNAME_NAME),
e.EMP_NAME
FROM ACCOUNTS ac,
CUSTOMER c,
EMPLOYEE e
WHERE ac.AGENT_CODE = e.AGENT_CODE
OR ac.AGENT_CODE = c.AGENT_CODE
Wrong results from the above query:
+------------+--------------------+------------+--------------+--------------+
| AGENT_CODE | WORKING_AREA | AGENT_TYPE | CUSTOMER_NAME| EMP_NAME |
+------------+--------------------+------------+--------------+--------------+
| A007 | Bangalore | CUSTOMER |Walter Holmes |Walter Holmes |
| A007 | London | EMPLOYEE |Walter Holmes |Peter Sam |
| A008 | New York | CUSTOMER |Micheal Junior|Micheal Junior|
| A007 | Bangalore | EMPLOYEE |Walter Holmes |John Tyler |
| A010 | Chennai | CUSTOMER |Micheal |Micheal |
| A007 | San Jose | EMPLOYEE |Walter Holmes |Albert |
+------------+--------------------+------------+--------------+--------------+
Expected result:
+------------+--------------------+------------+--------------+
| AGENT_CODE | WORKING_AREA | AGENT_TYPE | AGENT_NAME |
+------------+--------------------+------------+--------------+
| A007 | Bangalore | CUSTOMER |Walter Holmes |
| A003 | London | EMPLOYEE |Peter Sam |
| A008 | New York | CUSTOMER |Micheal Junior|
| A011 | Bangalore | EMPLOYEE |John Tyler |
| A010 | Chennai | CUSTOMER |Micheal |
| A012 | San Jose | EMPLOYEE |Albert |
+------------+--------------------+------------+--------------+
ACCOUNTS(AGENT_CODE -PrimaryKey)
+------------+--------------------+------------+
| AGENT_CODE | WORKING_AREA | AGENT_TYPE |
+------------+--------------------+------------+
| A007 | Bangalore | CUSTOMER |
| A003 | London | EMPLOYEE |
| A008 | New York | CUSTOMER |
| A011 | Bangalore | EMPLOYEE |
| A010 | Chennai | CUSTOMER |
| A012 | San Jose | EMPLOYEE |
| A005 | Brisban | EMPLOYEE |
+------------+--------------------+------------+
CUSTOMER(AGENT_CODE -ForeignKey)
+-----------+-------------+-------------+------------+
|CUST_CODE | FIRST_NAME | LAST_NAME | AGENT_CODE |
+-----------+-------------+-------------+------------+
| C00013 | Walter | Holmes | A007 |
| C00001 | Micheal | Junior | A008 |
| C00020 | Albert | Skyler | A010 |
+-----------+-------------+-------------+------------+
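EMPLOYEE(AGENT_CODE -ForeignKey)
+------------+----------+------------+
| EMP_NAME   | EMP_CODE | AGENT_CODE |
+------------+----------+------------+
| Peter Sam  | C00054   | A003       |
| John Tyler | C00023   | A011       |
| White Bolt | C00043   | A012       |
+------------+----------+------------+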
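To combine customer names and employee names into a single AGENT_NAME column, query each table separately and UNION the results. Filtering each branch on AGENT_TYPE keeps an AGENT_CODE that exists in both CUSTOMER and EMPLOYEE from matching the wrong table.
SELECT a.AGENT_CODE, a.WORKING_AREA, a.AGENT_TYPE, c.FIRST_NAME || ' ' || c.LAST_NAME AS AGENT_NAME
FROM ACCOUNTS a
JOIN CUSTOMER c ON c.AGENT_CODE = a.AGENT_CODE
WHERE a.AGENT_TYPE = 'CUSTOMER'
UNION
SELECT a.AGENT_CODE, a.WORKING_AREA, a.AGENT_TYPE, e.EMP_NAME
FROM ACCOUNTS a
JOIN EMPLOYEE e ON e.AGENT_CODE = a.AGENT_CODE
WHERE a.AGENT_TYPE = 'EMPLOYEE'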
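
The sketch below shows the UNION approach with an AGENT_TYPE filter on each branch, using SQLite as a stand-in engine (an assumption). The employee row that reuses agent code A007 is hypothetical and is there only to show that the type filter keeps it out of the customer account's result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (agent_code TEXT PRIMARY KEY, working_area TEXT, agent_type TEXT);
    CREATE TABLE customer (cust_code TEXT, first_name TEXT, last_name TEXT, agent_code TEXT);
    CREATE TABLE employee (emp_name TEXT, emp_code TEXT, agent_code TEXT);
    INSERT INTO accounts VALUES ('A007', 'Bangalore', 'CUSTOMER'), ('A003', 'London', 'EMPLOYEE');
    INSERT INTO customer VALUES ('C00013', 'Walter', 'Holmes', 'A007');
    -- Hypothetical clash: the second employee reuses the customer's agent code.
    INSERT INTO employee VALUES ('Peter Sam', 'C00054', 'A003'), ('John Tyler', 'C00023', 'A007');
""")

# Each branch joins one child table and only for accounts of the matching type.
query = """
    SELECT a.agent_code, a.working_area, a.agent_type,
           c.first_name || ' ' || c.last_name AS agent_name
    FROM accounts a
    JOIN customer c ON c.agent_code = a.agent_code
    WHERE a.agent_type = 'CUSTOMER'
    UNION
    SELECT a.agent_code, a.working_area, a.agent_type, e.emp_name
    FROM accounts a
    JOIN employee e ON e.agent_code = a.agent_code
    WHERE a.agent_type = 'EMPLOYEE'
    ORDER BY 1
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Without the two WHERE clauses, account A007 would also be returned with John Tyler as its name, which is the kind of duplicate the question describes.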

How to execute this query to compare dates

Write a query to display the students who are older than 'Balakrishnan'. Sort the results based on firstname in ascending order.
The output should look like this
+--------+-----------+----------+-------------+------------+-----------+
| STUDID | FIRSTNAME | LASTNAME | STREET | CITY | DOB |
+--------+-----------+----------+-------------+------------+-----------+
| 3009 | Abdul | Rahman | HAL | Bangalore | 19-JAN-88 |
| 3002 | Anand | Kumar | Indiranagar | Bangalore | 19-JAN-88 |
| 3001 | Dileep | Kumar | Jai Nagar | Bangalore | 10-MAR-89 |
| 3004 | Gowri | Shankar | Gandhipuram | Coimbatore | 22-DEC-87 |
| 3008 | John | Dravid | Mylapore | Chennai | 15-SEP-87 |
| 3006 | Prem | Kumar | Ramnagar | Coimbatore | 17-MAY-87 |
| 3007 | Rahul | Dravid | KKNagar | Chennai | 08-OCT-87 |
+--------+-----------+----------+-------------+------------+-----------+
Try this; it may help you:
SELECT * FROM TABLE_NAME
WHERE DOB < TO_DATE('DOB_of_Balakrishnan','DD-MM-YYYY')
ORDER BY FIRSTNAME;
I am using Oracle 11g.
Since the DOB of Balakrishnan is not provided, try looking it up with a subquery:
SELECT *
FROM table_name
WHERE dob < (SELECT dob
             FROM table_name
             WHERE LOWER(firstname) = 'balakrishnan')
ORDER BY firstname;
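
A small sketch of the subquery approach, run on SQLite rather than Oracle (an assumption). Dates are stored as ISO 8601 text so that plain string comparison orders them correctly. The table name `student` and the rows for Balakrishnan and Vinod are hypothetical, since the question does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (studid INTEGER, firstname TEXT, dob TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    (3009, "Abdul", "1988-01-19"),
    (3001, "Dileep", "1989-03-10"),
    (3006, "Prem", "1987-05-17"),
    (3003, "Balakrishnan", "1989-06-15"),  # hypothetical row and DOB
    (3005, "Vinod", "1990-02-01"),         # hypothetical younger student
])

# An earlier date of birth means an older student, hence the "<" comparison.
query = """
    SELECT firstname
    FROM student
    WHERE dob < (SELECT dob
                 FROM student
                 WHERE LOWER(firstname) = 'balakrishnan')
    ORDER BY firstname
"""
older = [row[0] for row in conn.execute(query)]
print(older)
```

In Oracle the DOB column would normally be a DATE, so the comparison works the same way without the text encoding used here.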

Is it possible to see the 'null' in the table in sql instead of blank

+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | |
| 7 | Muffy | 24 | Indore | |
+----+----------+-----+-----------+----------+
How do I print null instead of a blank space in the SALARY column for IDs 6 and 7 of the table above when inserting the values?
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | null |
| 7 | Muffy | 24 | Indore | null |
+----+----------+-----+-----------+----------+
You can use ISNULL() or IFNULL() in your SELECT, depending on the RDBMS. Your query would look something like this:
SELECT ID, NAME, AGE, ADDRESS, IFNULL(SALARY, 'null') FROM YOURTABLE
The Oracle equivalent of neelsg's answer uses NVL(). Because SALARY is numeric, convert it to text first so that both arguments are strings:
SELECT id, name, age, address, NVL(TO_CHAR(salary), 'null') FROM yourtable
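
The same idea can be tried out with the standard COALESCE function, sketched here on SQLite (an assumption; the table name `customers` is hypothetical). The stored values stay NULL, and only the query output replaces them with the text 'null'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, salary REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (5, "Hardik", 8500.0),
    (6, "Komal", None),   # Python None is stored as SQL NULL
    (7, "Muffy", None),
])

# Cast the number to text so both COALESCE arguments have the same type.
query = """
    SELECT id, name, COALESCE(CAST(salary AS TEXT), 'null') AS salary
    FROM customers
    ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)

# The underlying data is unchanged: the salary is still NULL in the table.
null_count = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE salary IS NULL"
).fetchone()[0]
```

Many command-line clients also have a display setting for NULL values, which avoids changing the query at all.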