Oracle - Tree like query - sql

There is a table: Groups
It has three columns: ID, NAME, PARENT
This would flow in the following way.
Suppose there is a Group ELECTRONICS
Under ELECTRONICS, there is MOBILE
Under MOBILE there is SAMSUNG
Under SAMSUNG there is GALAXY EDGE
Under GALAXY EDGE there is 16GB and 8GB
The data in the database would look like this:
ID  NAME         PARENT
1   ELECTRONICS  null
2   MOBILE       ELECTRONICS
3   SAMSUNG      MOBILE
4   16GB         SAMSUNG
5   8GB          SAMSUNG
There may be N levels of hierarchy. I want to retrieve all the records of the last level.
In this case, return 16GB and 8GB.

The usual approach to such a problem is a recursive query. In Oracle this can be done using CONNECT BY.
However, to get all rows from the last level, no recursive query is necessary.
Those are all the rows whose name does not appear in the parent column:
select *
from Groups
where name not in (select parent
                   from groups g2
                   where g2.parent is not null);
SQLFiddle: http://sqlfiddle.com/#!4/df70d/1
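The same "rows that are nobody's parent" idea can also be written with NOT EXISTS, which does not need the extra IS NOT NULL filter (NOT IN returns no rows at all if the subquery yields a NULL). A minimal sketch, assuming SQLite via Python's sqlite3 module as a stand-in for Oracle, with the sample rows from the question:

```python
import sqlite3

# SQLite is only a stand-in for Oracle here; the rows mirror the question's sample data.
conn = sqlite3.connect(":memory:")
# The table name is quoted because GROUPS is a keyword in newer SQLite versions.
conn.execute('CREATE TABLE "groups" (id INTEGER, name TEXT, parent TEXT)')
conn.executemany(
    'INSERT INTO "groups" VALUES (?, ?, ?)',
    [
        (1, "ELECTRONICS", None),
        (2, "MOBILE", "ELECTRONICS"),
        (3, "SAMSUNG", "MOBILE"),
        (4, "16GB", "SAMSUNG"),
        (5, "8GB", "SAMSUNG"),
    ],
)

# A leaf is a row whose name is never used as another row's parent.
leaf_sql = """
    SELECT g.name
    FROM "groups" g
    WHERE NOT EXISTS (SELECT 1 FROM "groups" c WHERE c.parent = g.name)
    ORDER BY g.id
"""
leaves = [row[0] for row in conn.execute(leaf_sql)]
print(leaves)
```

The NOT EXISTS form should carry over to Oracle unchanged apart from the quoting of the table name.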
A recursive query is useful for finding all nodes below a certain category, for example everything below SAMSUNG (the result also includes the SAMSUNG row itself):
select *
from groups
start with name = 'SAMSUNG'
connect by prior name = parent;
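For databases without CONNECT BY, the same traversal can be expressed as a recursive common table expression. The sketch below is an assumption-laden illustration rather than Oracle code: it runs on SQLite through Python's sqlite3 module, and the depth column is added only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The table name is quoted because GROUPS is a keyword in newer SQLite versions.
conn.execute('CREATE TABLE "groups" (id INTEGER, name TEXT, parent TEXT)')
conn.executemany(
    'INSERT INTO "groups" VALUES (?, ?, ?)',
    [
        (1, "ELECTRONICS", None),
        (2, "MOBILE", "ELECTRONICS"),
        (3, "SAMSUNG", "MOBILE"),
        (4, "16GB", "SAMSUNG"),
        (5, "8GB", "SAMSUNG"),
    ],
)

subtree_sql = """
    WITH RECURSIVE subtree(id, name, parent, depth) AS (
        -- anchor member: plays the role of START WITH
        SELECT id, name, parent, 1 FROM "groups" WHERE name = 'SAMSUNG'
        UNION ALL
        -- recursive member: plays the role of CONNECT BY PRIOR name = parent
        SELECT g.id, g.name, g.parent, s.depth + 1
        FROM "groups" g
        JOIN subtree s ON g.parent = s.name
    )
    SELECT name, depth FROM subtree ORDER BY depth, id
"""
subtree = conn.execute(subtree_sql).fetchall()
print(subtree)
```

Changing the anchor to name = 'ELECTRONICS' would walk the whole tree from the root.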

Related

Apache Hive command

I have this question: show the top 5 game disciplines for the countries that won more than 10 gold medals.
My code is: select distinct t.discipline, m.team from teams t join medals m on (t.noc=m.team and m.numbergold>10) order by m.team;
Could you please help me change this code so that it displays the top 5 games for each country?
Use GROUP BY together with a window function to get the result you need.
Apply a window function such as dense_rank over numbergold so that you can filter on the top 5.

query to get the results of each row in table1 with a subquery of N maximum records found to meet a condition in table2

I am trying without success to calculate building heights in my city using the LIDAR satellite dataset.
System specs
CPU: Core i7 6700k 4200MHz, 4 cores, 8 threads
RAM: 32GB DDR4 3200MHz
SSD: 1TB Samsung 970 EVO
OS: Ubuntu 18.04
Postgres setup
I am using Postgres v12.1 (the latest version) with PostGIS and the following tweaks recommended in different sources:
shared_buffers = 256MB
maintenance_work_mem = 4GB
max_parallel_maintenance_workers = 7
max_parallel_workers = 7
max_wal_size = 60GB
min_wal_size = 90MB
random_page_cost = 1.0
Database setup
In the lidar table I have more than 3000 million rows, and in the buildings table more than 150000 rows.
In the lidar table the GiST index was created: CREATE INDEX lidar_idx ON lidar USING GIST (geom);
building table: | gid | geom |
lidar table: | z | geom |
Height calculation
Currently in order to calculate the height of a building, it is necessary to check if each one of the 3000 million points (rows) is inside the area of each building and calculate the average of all the points found inside a building area.
The queries I have tried are taking forever (probably more than 5 days or even more) and I would like to simplify the query so that I can get the height of the building with a lot less points, without having to compare with all the insane 3000 million records each time for each building.
For example:
For building with id1, I would like to get only the first 100 records found which are inside the building geometry area ( ST_Within(l.geom, e.geom) ), and once those 100 records are found, pass to the next building.
For building with id2, I would like the same, get only the first 100 records found which are inside the building area.
And so on.
My main query is
SELECT e.gid, AVG(l.z) AS height
FROM lidar l,
     buildings e
WHERE ST_Within(l.geom, e.geom)
GROUP BY e.gid
I have tried with another query, but I can not get it to work.
SELECT e.gid, AVG(l.z), COUNT(1) FILTER (WHERE ST_Within(l.geom, e.geom)) AS gidc
FROM lidar l, buildings e
WHERE gidc < 100
GROUP BY e.gid
I don't think you really want to do this at all. You should first try to make the correct query faster rather than compromising correctness by working with an arbitrary (but not random) subset of the data.
But if you do want it, then you can use a lateral join. The LIMIT has to be applied to the points before they are averaged, so it goes in an inner subquery:
SELECT e.gid, t.height
FROM buildings e
CROSS JOIN LATERAL (
    SELECT AVG(s.z) AS height
    FROM (SELECT l.z FROM lidar l WHERE ST_Within(l.geom, e.geom) LIMIT 100) s
) t;
You wrote: "it is necessary to check if each one of the 3000 million points (rows) is inside the area of each building and calculate the average of all the points found inside a building area."
This is exactly what a geometry index is for. You don't need to look at every point to get just the ones inside a building area. If you don't have the right index, such as on lidar using gist (geom), then the lateral join query will also be awful.

How do I create a view on data

I have a table with 13 fields including:
Computer, Application
I need to have a similar table/view that lists COUNT(Application) along with Application listed only once for each Computer. All fields must exist with the addition of the new field.
I need something similar to:
Computer | Application | AppCount | ...
USD9090  | MS Outlook  | 3
UOD0909  | MS Outlook  | 5
UDL4563  | Skype       | 4
I've tried grouping by Computer with a COUNT of Application:
SELECT TOP 5 Computer, ComputerID, Application FROM AppReliability WHERE EXISTS
(SELECT TOP 5 Count(Application) AS App, Computer
FROM AppReliability
WHERE Date >= DATEADD(day,-30,GETDATE())
GROUP BY Computer
ORDER BY App DESC)
I can't get the correct output.
Do you just want an aggregation query?
select Computer, Application, count(*) as AppCount
from AppReliability
group by Computer, Application;
Your question doesn't mention anything about dates or why you would be using top (5).
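If every original column really has to survive next to the count, one option is a window count instead of GROUP BY, which keeps each row and attaches the per-group total to it. This is a sketch under assumptions: it runs on SQLite (3.25 or later) via Python rather than SQL Server, only three of the 13 fields are modelled, and EventDate is a hypothetical name standing in for the Date column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of AppReliability with made-up rows.
conn.execute(
    "CREATE TABLE AppReliability (Computer TEXT, Application TEXT, EventDate TEXT)"
)
conn.executemany(
    "INSERT INTO AppReliability VALUES (?, ?, ?)",
    [
        ("USD9090", "MS Outlook", "2024-01-01"),
        ("USD9090", "MS Outlook", "2024-01-02"),
        ("USD9090", "MS Outlook", "2024-01-03"),
        ("UDL4563", "Skype", "2024-01-01"),
        ("UDL4563", "Skype", "2024-01-05"),
    ],
)

# COUNT(*) OVER (...) does not collapse rows, so all columns remain available.
count_sql = """
    SELECT Computer, Application, EventDate,
           COUNT(*) OVER (PARTITION BY Computer, Application) AS AppCount
    FROM AppReliability
    ORDER BY Computer, Application, EventDate
"""
with_counts = conn.execute(count_sql).fetchall()
for row in with_counts:
    print(row)
```

The same COUNT(*) OVER (PARTITION BY ...) expression is also available in SQL Server, so it could be placed in a view definition there; listing each Computer and Application pair only once would still need the GROUP BY query above.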

Using Count and Group By in Power BI

I have a table that contains data about different benefit plans and users enrolled in one or more of those plans. So basically the table contains two columns representing the benefit plan counts and total users enrolled in those plans.
I need to create visualization in Power BI to represent the number of total users enrolled in 1 plan, 2 plans, 3 plans, ...etc.
I wrote the query in SQL to get the desired result, but I am not sure how to do the same in Power BI.
Below is my sql query:
SELECT S.PlanCount, COUNT(S.UserName) AS Participants
FROM (
SELECT A.Username, COUNT(*) AS PlanCount
FROM [dbo].[vw_BenefitsCount_Plan_Participants] AS A
GROUP BY A.username
)AS S
GROUP BY S.PlanCount
ORDER BY S.PlanCount
The query result is shown in the image below:
Here, the PlanCount column represents the number of different benefit plans that users are enrolled in. For example, the first row means that a total of 6008 members are enrolled in only 1 plan, row 2 shows that 3030 members are enrolled in 2 plans, and similarly row 5 means that only 10 users are enrolled in 6 plans.
I am new to Power BI and trying to understand DAX functions, but I couldn't find a reasonable example that could help me create my visualization.
I found something similar here and here, but they seem to be geared more towards a single count and group by.
Here is a simple example. I have a table of home owners who have homes in multiple cities.
Now in this table, Alex, Dave and Julie have home in 1 city (basically we can say that these 3 people own just 1 home each). Similarly Jim owns a total of 2 homes and Bob and Pam each have 3 homes in total.
Now the output that I need is a table with total number of home owners that own 1 home, 2 homes and so on. So the resulting table in SQL is this.
Where NameCount is basically count of total home owners and Homes is the count of total homes these home owners have.
Please let me know if this helps.
Thanks.
If I understood correctly, you have a table like this:
BenefitPlan | User
1 | Max
1 | Joe
2 | Max
3 | Anna
If that is right, you can simply use a bar chart (for example) where the Axis is BenefitPlan and the Value is User. When you drag a column into the Value field, it is grouped automatically (like GROUP BY in SQL), and by default the grouping method is count.
Hope it helps.
Regards.
You can use DAX to create a summary table from your data table:
https://community.powerbi.com/t5/Desktop/Creating-a-summary-table-out-of-existing-table-assistance/td-p/431485
Once you have counted plans by customer you will then have a field that will enable you to visualize the # of customers with each count.
Mock-up of the code:
PlanSummary = SUMMARIZE('vw_BenefitsCount_Plan_Participants', 'vw_BenefitsCount_Plan_Participants'[Username], "PlanCount", COUNT('vw_BenefitsCount_Plan_Participants'[PLAN_ID]))
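The two-step shape of the calculation (first count rows per person, then count people per count) can be checked on the home owner example from the question. This is a sketch only: it uses SQLite through Python instead of Power BI, the homes table and city names are hypothetical, and only the number of homes per owner comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homes (owner TEXT, city TEXT)")

# Homes per owner as described in the question; the city names are placeholders.
homes_per_owner = {"Alex": 1, "Dave": 1, "Julie": 1, "Jim": 2, "Bob": 3, "Pam": 3}
rows = [
    (owner, f"City{i}")
    for owner, n in homes_per_owner.items()
    for i in range(1, n + 1)
]
conn.executemany("INSERT INTO homes VALUES (?, ?)", rows)

summary_sql = """
    SELECT s.Homes, COUNT(*) AS NameCount
    FROM (
        -- step 1: one row per owner with their number of homes (the summary table)
        SELECT owner, COUNT(*) AS Homes
        FROM homes
        GROUP BY owner
    ) AS s
    -- step 2: count owners per number of homes (what the visual aggregates)
    GROUP BY s.Homes
    ORDER BY s.Homes
"""
summary = conn.execute(summary_sql).fetchall()
print(summary)
```

In Power BI terms, step 1 corresponds to the summary table and step 2 to placing its count column on the axis of a visual with a count of users as the value.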

Access - Sort by a combination of fields

I had (see below) a table of (fictional) stars and then two tables showing their equally fictional satellites, which could be planets or asteroid belts: tblStar, tblAsteroid and tblPlanet respectively. Each of the two satellite tables had a position field which was unique across the two tables. By this I mean that the star with ID 1 had only one satellite in position 1, one in position 2 and so on, each of which could have been in either of those two tables but not both. I wanted to sort the satellites in order of position on my reports but couldn't see a way of sorting across the combination of those fields:
tblAsteroid:
Asteroid ID  Position
1            1
2            3
tblPlanet:
Planet ID  Position  Biome
1          2         Ice
Giving:
Position  AsteroidOrPlanet  Biome
1         Asteroid          N/A
2         Planet            Ice
3         Asteroid          N/A
For the avoidance of doubt, I recognise that this problem was caused by a flaw in my database design and I should have had a tblSatellite which contained that position and was in a 1 to many with tblStar and in 1 to 0-1's with tblAsteroid and tblPlanet. I've since fixed this, I'm just wondering if it would have been possible.
To get a combined list, you need a UNION query anyway. This you can sort by a common field.
SELECT Position, 'Asteroid' AS AsteroidOrPlanet, 'N/A' AS Biome
FROM tblAsteroid
UNION ALL
SELECT Position, 'Planet' AS AsteroidOrPlanet, Biome
FROM tblPlanet
ORDER BY Position
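To see the combined sort on the sample rows from the question, here is a small sketch. It assumes SQLite via Python's sqlite3 module rather than Access, and the ID column names (AsteroidID, PlanetID) are written without spaces for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblAsteroid (AsteroidID INTEGER, Position INTEGER)")
conn.execute("CREATE TABLE tblPlanet (PlanetID INTEGER, Position INTEGER, Biome TEXT)")
conn.executemany("INSERT INTO tblAsteroid VALUES (?, ?)", [(1, 1), (2, 3)])
conn.execute("INSERT INTO tblPlanet VALUES (1, 2, 'Ice')")

# The ORDER BY applies to the whole UNION ALL result, not just the last SELECT.
union_sql = """
    SELECT Position, 'Asteroid' AS AsteroidOrPlanet, 'N/A' AS Biome
    FROM tblAsteroid
    UNION ALL
    SELECT Position, 'Planet' AS AsteroidOrPlanet, Biome
    FROM tblPlanet
    ORDER BY Position
"""
combined = conn.execute(union_sql).fetchall()
for row in combined:
    print(row)
```

With more than one star, the same pattern would presumably need the star ID selected in both branches and placed first in the ORDER BY.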