I have a table which contains geometry lines (ways).There are lines that have a unique geometry (not repeating) and lines which have the same geometry (2,3,4 and more). I want to list only unique ones. If there are, for example, 2 lines with the same geometry I want to drop them. I tried DISTINCT but it also shows the first result from duplicated lines. I only want to see the unique ones.
I tried window function but I get similar result (I get a counter on first line from the duplicating ones). Sorry for a newbie question but I'm learning :) Thanks!
Example:
way|
1 |
1 |
2 |
3 |
3 |
4 |
Result should be:
way|
2 |
4 |
That actually worked. Thanks a lot. I also have other tags in this table for every way (name, ref and few other tags) When I add them to the query I loose the segregation.
select count(way), way, name
from planet_osm_line
group by way, name
having count(way) = 1;
Without "name" in the query I get all unique values listed but I want to keep "name" for every line. With this example I stilll get all the lines in the table listed.
To expound on #Nithila answer:
select count(way), way
from your_table
group by way
having count(way) = 1;
You first calculate the rows you want, and then search for the rest of the fields. So the aggregation doesnt cause you problems.
WITH singleRow as (
select count(way), way
from planet_osm_line
group by way
having count(way) = 1
)
SELECT P.*
FROM planet_osm_line P
JOIN singleRow S
ON P.way = S.way
you can group by way and while taking the data out check the count=1.It will give non duplicating data.
#voyteck
As I understood your question you need to get only non duplicating records of way column and for each row you need to show the name is it
If so, you have to put all the column in select statement, but no need to group by all the columns.
select count(way), way, name
from planet_osm_line
group by way
having count(way) = 1;
Related
I have two tables, one with a single row for each "batch_number" and another with defect details for each batch. The first table has a "defect_of_interest" column which I would like to link to one of the columns in the second table. I am trying to write a query that would then pick the maximum value in that dynamically linked column for any "unit_number" in the "batch_number".
Here is the SQLFiddle with example data for each table: http://sqlfiddle.com/#!9/a1c27d
For example, the maximum value in the DEFECT_DETAILS.SCRATCHES column for BATCH_NUMBER = A1 is 12.
Here is my desired output:
BATCH_NUMBER DEFECT_OF_INTEREST MAXIMUM_DEFECT_COUNT
------------ ------------------ --------------------
A1 SCRATCHES 12
B3 BUMPS 4
C2 STAINS 9
I have tried using the PIVOT function, but I can't get it to work. Not sure if it works in cases like this. Any help would be much appreciated.
If the number of columns is fixed (it seems to be) you can use CASE to select the specific value according to the related table. Then aggregating is simple.
For example:
select
batch_number,
max(defect_of_interest) as defect_of_interest,
max(defect_count) as maximum_defect_count
from (
select
d.batch_number,
b.defect_of_interest,
case when b.defect_of_interest = 'SCRATCHES' then d.scratches
when b.defect_of_interest = 'BUMPS' then d.bumps
when b.defect_of_interest = 'STAINS' then d.stains
end as defect_count
from defect_details d
join batches b on b.batch_number = d.batch_number
) x
group by batch_number
order by batch_number;
See Oracle example in db<>fiddle.
I am pulling data and when I pull in the text field my results for the "distinct ID" are sometimes being duplicated when there are multiple results for that ID. Is there a way to concatenate the results into a single column/row rather than having them duplicated?
It looks like there are ways in other SQL platforms but I have not been able to find something that works in HANA.
Example
Select
Distinct ID
From Table1
If I pull only Distinct ID I get the following:
ID
1
2
3
4
However when I pull the following:
Example
Select
Distinct ID,Text
From Table1
I get something like
ID
Text
1
Dog
2
Cat
2
Dog
3
Fish
4
Bird
4
Horse
I am trying to Concat the Text field when there is more than 1 row for each ID.
What I need the results to be (Having a "break" between results so that they are on separate lines would be even better but at least a "," would work):
ID
Text
1
Dog
2
Cat,Dog
3
Fish
4
Bird,Horse
I see Kiran has just referred to another valid answer in the comment, but in your example this would work.
SELECT ID, STRING_AGG(Text, ',')
FROM TABLE1
GROUP BY ID;
You can replace the ',' with other characters, maybe a '\n' for a line break
I would caution against the approach to concatenate rows in this way, unless you know your data well. There is no effective limit to the rows and length of the string that you will generate, but HANA will have a limit on string length, so consider that.
I'm looking to select the primary key of a row and I've only got a column that contains info (in a substring) that I need to select the row.
E.g. MyTable
ID | Label
------------
11 | 1593:#:#:RE: test
12 | 1239#:#:#some more random text
13 | 12415#:#:#some more random text about the weather
14 | 369#:#:#some more random text about the StackOverflow
The label column has always a delimiter of :#:#:
So really I guess, I'd need to be able to split this row by the delimiter, grab the first part of the label column (i.e. the number I'm looking) to get the id I wanted.
So, If I wanted row with ID of 14, then I'd be:
Select ID from MyTable
where *something* = '369'
Any ideas on how to construct something ..or how best to go about this:)
I'm completely stumped and haven't been able to find how to do this.
Thanks,
How about:
WHERE label LIKE '369#%'?
No reason to get fancy.
Although.. if you are going to do this search often, then maybe pre-split that value out to another column as part of your ETL process and index it.
I have a table that looks like so:
Is there any way I can display it like this in SSRS?
1 2 3 4
1 2.091 0.918 0.9 1.718
2 0.647 0.964 0.6 2.264
3 0.804 0.789 0.6 1.9
I tried using a matrix but it stops after showing the first column and doesn't expand to show the rest.
Attempt:
Results:
Another attempt:
Results:
to get something similar to the result you're looking for you need to write a query like:
select id
,row_number() over(partition by SubscaleNumber order by SubscaleNumber ) SubscaleNumber
, item
from yourtable
then the matrix should work fine.
You need to set the Row and Column grouping to get this matrix-style output.
When you click in your table you should see the brackets for the Row and Column groups. If you don't have a column group, add one.
Go into the properties of each group and make sure they are grouped by the column you want.
Start from scratch with a table or matrix.
Add a Row group on id
Add a Column Group on Item.
I have a Hive table, titled 'UK.Choices' with a column, titled 'Fruit', with each row as follows:
AppleBananaAppleOrangeOrangePears
BananaKiwiPlumAppleAppleOrange
KiwiKiwiOrangeGrapesAppleKiwi
etc.
etc.
There are 2.5M rows and the rows are much longer than the above.
I want to count the number of instances that the word 'Apple' appears.
For example above, it is:
Number of 'Apple'= 5
My sql so far is:
select 'Fruit' from UK.Choices
Then in chunks of 300,000 I copy and paste into Excel, where I'm more proficient and able to do this using formulas. Problem is, it takes upto an hour and a half to generate each chunk of 300,000 rows.
Anyone know a quicker way to do this bypassing Excel? I can do simple things like counts using where clauses, but something like the above is a little beyond me right now. Please help.
Thank you.
I think I am 2 years too late. But since I was looking for the same answer and I finally managed to solve it, I thought it was a good idea to post it here.
Here is how I do it.
Solution 1:
+-----------------------------------+---------------------------+-------------+-------------+
| Fruits | Transform 1 | Transform 2 | Final Count |
+-----------------------------------+---------------------------+-------------+-------------+
| AppleBananaAppleOrangeOrangePears | #Banana#OrangeOrangePears | ## | 2 |
| BananaKiwiPlumAppleAppleOrange | BananaKiwiPlum##Orange | ## | 2 |
| KiwiKiwiOrangeGrapesAppleKiwi | KiwiKiwiOrangeGrapes#Kiwi | # | 1 |
+-----------------------------------+---------------------------+-------------+-------------+
Here is the code for it:
SELECT length(regexp_replace(regexp_replace(fruits, "Apple", "#"), "[A-Za-z]", "")) as number_of_apples
FROM fruits;
You may have numbers or other special characters in your fruits column and you can just modify the second regexp to incorporate that. Just remember that in hive to escape a character you may need to use \\ instead of just one \.
Solution 2:
SELECT size(split(fruits,"Apple"))-1 as number_of_apples
FROM fruits;
This just first split the string using "Apple" as a separator and makes an array. The size function just tells the size of that array. Note that the size of the array is one more than the number of separators.
This is straight-forward if you have any delimiter ( eg: comma ) between the fruit names. The idea is to split the column into an array, and explode the array into multiple rows using the 'explode' function.
SELECT fruit, count(1) as count FROM
( SELECT
explode(split(Fruit, ',')) as fruit
FROM UK.Choices ) X
GROUP BY fruit
From your example, it looks like fruits are delimited by Capital letters. One idea is to split the column based on capital letters, assuming there are no fruits with same suffix.
SELECT fruit_suffix, count(1) as count FROM
( SELECT
explode(split(Fruit, '[A-Z]')) as fruit_suffix
FROM UK.Choices ) X
WHERE fruit_suffix <> ''
GROUP BY fruit_suffix
The downside is that, the output will not have first letter of the fruit,
pple - 5
range - 4
I think you want to run in one select, and use the Hive if UDF to sum for the different cases. Something like the following...
select sum( if( fruit like '%Apple%' , 1, 0 ) ) as apple_count,
sum( if( fruit like '%Orange%', 1, 0 ) ) as orange_count
from UK.Choices
where ID > start and ID < end;
instead of a join in the above query.
No experience of Hive, I'm afraid, so this may or may not work. But on SQLServer, Oracle etc I'd do something like this:
Assuming that you have an int PK called ID on the row, something along the lines of:
select AppleCount, OrangeCount, AppleCount - OrangeCount score
from
(
select count(*) as AppleCount
from UK.Choices
where ID > start and ID < end
and Fruit like '%Apple%'
) a,
(
select count(*) as OrangeCount
from UK.Choices
where ID > start and ID < end
and Fruit like '%Orange%'
) o
I'd leave the division by the total count to the end, when you have all the rows in the spreadsheet and can count them there.
However, I'd urgently ask my boss to let me change the Fruit field to be a table with an FK to Choices and one fruit name per row. Unless this is something you can't do in Hive, this design is something that makes kittens cry.
PS I'd missed that you wanted the count of occurances of Apple which this won't do. I'm leaving my answer up, because I reckon that my However... para is actually a good answer. :(