I want to add two extra rows to my table for each part number that is present. Currently I have something like this:
+-------------+-----------+---------------+
| item_number | operation | resource_code |
+-------------+-----------+---------------+
| abc | 10 | kit |
| abc | 20 | build |
| abc | 30 | test |
+-------------+-----------+---------------+
There are hundreds more items set up like this in the table. I want to add two extra records to the table for each part number, so that once they have been added my data set looks like this:
+-------------+-----------+---------------+
| item_number | operation | resource_code |
+-------------+-----------+---------------+
| abc | 10 | kit |
| abc | 20 | build |
| abc | 30 | test |
| abc | NULL | NULL |
| abc | NULL | NULL |
+-------------+-----------+---------------+
I want these new records to be blank for now; I will fill them in later.
I am using Access and am looking for the SQL to add these new records to the table.
Try this on for size:
INSERT INTO my_table
SELECT item_number, NULL AS operation, NULL AS resource_code
FROM my_table
GROUP BY item_number
UNION ALL
SELECT item_number, NULL AS operation, NULL AS resource_code
FROM my_table
GROUP BY item_number
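As a rough check of the logic, the statement above can be run against an in-memory SQLite database from Python. This is only a sketch: SQLite is standing in for Access (which may treat a UNION inside an append query differently), and the second part number "xyz" is made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the Access table; names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (item_number TEXT, operation INTEGER, resource_code TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("abc", 10, "kit"),
        ("abc", 20, "build"),
        ("abc", 30, "test"),
        ("xyz", 10, "kit"),  # hypothetical second part number
    ],
)

# Each half of the UNION ALL yields one blank row per item, so two per item.
conn.execute("""
    INSERT INTO my_table
    SELECT item_number, NULL AS operation, NULL AS resource_code
    FROM my_table
    GROUP BY item_number
    UNION ALL
    SELECT item_number, NULL AS operation, NULL AS resource_code
    FROM my_table
    GROUP BY item_number
""")

# Count the blank placeholder rows that now exist for each item.
blank_counts = conn.execute("""
    SELECT item_number, COUNT(*)
    FROM my_table
    WHERE operation IS NULL AND resource_code IS NULL
    GROUP BY item_number
    ORDER BY item_number
""").fetchall()
print(blank_counts)
```

Every item ends up with exactly two blank rows regardless of how many operations it already had, because GROUP BY collapses each item to a single row before the rows are inserted.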
I have a table with 12,000 rows of data. The table is comprised of 7 columns of data (PIDA, NIDA, SIDA, IIPA, RPRP, IORS, DDSN) each column with 4 entry types ("Supported", "Not Supported", "Uncatalogued", or "NULL" entries)
+--------------+-----------+--------------+-----------+
| PIDA | NIDA | SIDA | IIPA |
+--------------+-----------+--------------+-----------+
| Null | Supported | Null | Null |
| Uncatalogued | Supported | Null | Null |
| Supported | Supported | Uncatalogued | Supported |
| Supported | Null | Uncatalogued | Null |
+--------------+-----------+--------------+-----------+
I would like to generate an output that counts each entry type per column, like a column-to-row transpose.
+---------------+------+------+------+------+
| Categories | PIDA | NIDA | SIDA | IIPA |
+---------------+------+------+------+------+
| Supported | 10 | 20 | 50 | 1 |
| Not Supported | 30 | 50 | 22 | 5 |
| Uncatalogued | 5 | 10 | 22 | 22 |
| NULL | 10 | 11 | 22 | 22 |
+---------------+------+------+------+------+
I am not having any luck with inline selects or CASE statements. I have a feeling a bit of both is needed: first count the values, then list each category as a row in the output.
Thanks all,
One option is to UNPIVOT your data and then PIVOT the results
Example
Select *
From (
Select B.*
From YourTable A
Cross Apply ( values (PIDA,'PIDA',1)
,(NIDA,'NIDA',1)
,(SIDA,'SIDA',1)
,(IIPA,'IIPA',1)
) B(Categories,Item,Value)
) src
Pivot ( sum(Value) for Item in ([PIDA],[NIDA],[SIDA],[IIPA] ) ) pvt
Results (with small sample size)
Categories PIDA NIDA SIDA IIPA
NULL 1 1 2 3
Supported 2 3 NULL 1
Uncatalogued 1 NULL 2 NULL
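For engines without CROSS APPLY or PIVOT, roughly the same unpivot-then-pivot idea can be sketched with UNION ALL and conditional aggregation. The example below is an assumption-laden illustration: it uses SQLite through Python, loads only the four sample rows from the question, and stores the "Null" entries as real NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (PIDA TEXT, NIDA TEXT, SIDA TEXT, IIPA TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (None, "Supported", None, None),
        ("Uncatalogued", "Supported", None, None),
        ("Supported", "Supported", "Uncatalogued", "Supported"),
        ("Supported", None, "Uncatalogued", None),
    ],
)

rows = conn.execute("""
    WITH unpivoted AS (
        -- Unpivot: one (category, item) row per cell of the original table
        SELECT PIDA AS category, 'PIDA' AS item FROM YourTable
        UNION ALL SELECT NIDA, 'NIDA' FROM YourTable
        UNION ALL SELECT SIDA, 'SIDA' FROM YourTable
        UNION ALL SELECT IIPA, 'IIPA' FROM YourTable
    )
    -- Pivot: count the cells per category, one output column per source column
    SELECT category,
           SUM(CASE WHEN item = 'PIDA' THEN 1 ELSE 0 END) AS PIDA,
           SUM(CASE WHEN item = 'NIDA' THEN 1 ELSE 0 END) AS NIDA,
           SUM(CASE WHEN item = 'SIDA' THEN 1 ELSE 0 END) AS SIDA,
           SUM(CASE WHEN item = 'IIPA' THEN 1 ELSE 0 END) AS IIPA
    FROM unpivoted
    GROUP BY category
    ORDER BY category
""").fetchall()

for row in rows:
    print(row)
```

The counts agree with the PIVOT results shown above, except that combinations with no rows come out as 0 here rather than NULL because of the ELSE 0 branch.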
I have an Oracle table like this
| id | code | info | More cols |
|----|------|------------------|-----------|
| 1 | 13 | The Thirteen | dggf |
| 1 | 18 | The Eighteen | ghdgffg |
| 1 | 18 | The Eighteen | |
| 1 | 9 | The Nine | ghdfgjgf |
| 1 | 9 | Die Neun | ghdfgjgf |
| 1 | 75 | The Seventy-five | ghfgh |
| 1 | 75 | The Seventy-five | ghfgh |
| 1 | 2 | The Two | ghfgh |
| 1 | 27 | The Twenty-Seven | |
| 1 | 27 | The Twenty-Seven | |
| 1 | 27 | el veintisiete | fghfg |
| . | . | . | . |
| . | . | . | . |
| . | . | . | . |
In this table I want to find all rows with values in column code which have more than one distinct value in the info column. So from the listed rows this would be the values 9 and 27 and the associated rows.
I tried to construct a first query like
SELECT code FROM mytable
WHERE COUNT(DISTINCT info) >1
but I get an "ORA-00934: group function is not allowed here" error. Also I don't know how to express the condition COUNT(DISTINCT info) "with a fixed postcode".
You need GROUP BY with HAVING. Aggregate functions are not allowed in a WHERE clause, because WHERE filters individual rows before grouping; HAVING filters the groups afterwards.
SELECT code
FROM mytable
group by code
having COUNT(DISTINCT info) >1
I would write your query as:
SELECT code
FROM yourTable
GROUP BY code
HAVING MIN(info) <> MAX(info);
Writing the HAVING logic this way leaves the query sargable, meaning that an index on (code, info) should be usable.
You could also do this using exists logic:
SELECT DISTINCT code
FROM yourTable t1
WHERE EXISTS (SELECT 1 FROM yourTable t2 WHERE t2.code = t1.code AND t2.info <> t1.info);
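To compare the three formulations on the sample rows from the question, here is a small sketch using SQLite through Python rather than Oracle. The "More cols" column is left out, and the final query that fetches the associated rows is an illustrative addition, not part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, code INTEGER, info TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 13, "The Thirteen"),
        (1, 18, "The Eighteen"), (1, 18, "The Eighteen"),
        (1, 9, "The Nine"), (1, 9, "Die Neun"),
        (1, 75, "The Seventy-five"), (1, 75, "The Seventy-five"),
        (1, 2, "The Two"),
        (1, 27, "The Twenty-Seven"), (1, 27, "The Twenty-Seven"),
        (1, 27, "el veintisiete"),
    ],
)

queries = {
    "count_distinct": """
        SELECT code FROM mytable
        GROUP BY code HAVING COUNT(DISTINCT info) > 1 ORDER BY code""",
    "min_max": """
        SELECT code FROM mytable
        GROUP BY code HAVING MIN(info) <> MAX(info) ORDER BY code""",
    "exists": """
        SELECT DISTINCT code FROM mytable t1
        WHERE EXISTS (SELECT 1 FROM mytable t2
                      WHERE t2.code = t1.code AND t2.info <> t1.info)
        ORDER BY code""",
}
# Run each variant and keep just the list of codes it returns.
results = {name: [r[0] for r in conn.execute(sql)] for name, sql in queries.items()}
print(results)

# Fetch the associated rows by filtering on the codes found above.
matching_rows = conn.execute("""
    SELECT code, info FROM mytable
    WHERE code IN (SELECT code FROM mytable
                   GROUP BY code HAVING COUNT(DISTINCT info) > 1)
    ORDER BY code, info
""").fetchall()
print(matching_rows)
```

On this data all three variants return codes 9 and 27. Note that both aggregate variants ignore NULLs in info, so a code whose info is sometimes NULL and otherwise constant would not be reported.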
I have two tables. Like this.
select * from extrafieldvalues;
+----------------------------+
| id | value | type | idItem |
+----------------------------+
| 1 | 100 | 1 | 10 |
| 2 | 150 | 2 | 10 |
| 3 | 101 | 1 | 11 |
| 4 | 90 | 2 | 11 |
+----------------------------+
select * from items
+------------+
| id | name |
+------------+
| 10 | foo |
| 11 | bar |
+------------+
I need to make a query and get something like this:
+--------------------------------------+
| idItem | valtype1 | valtype2 | name |
+--------------------------------------+
| 10 | 100 | 150 | foo |
| 11 | 101 | 90 | bar |
+--------------------------------------+
The quantity of types of extra field values is variable, but every item ALWAYS uses every extra field.
If you have only two fields, then left join is an option for this:
select i.*, efv1.value as value_1, efv2.value as value_2
from items i left join
extrafieldvalues efv1
on efv1.iditem = i.id and
efv1.type = 1 left join
extrafieldvalues efv2
on efv2.iditem = i.id and
efv2.type = 2 ;
In terms of performance, two joins are probably faster than an aggregation, and they make it easier to bring in more columns from items. On the other hand, conditional aggregation generalizes more easily, and its performance changes little as more columns from extrafieldvalues are added to the select.
Use conditional aggregation
select iditem,
max(case when type=1 then value end) as valtype1,
max(case when type=2 then value end) as valtype2,name
from extrafieldvalues a inner join items b on a.iditem=b.id
group by iditem,name
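Since the question says the number of types is variable, one possible approach is to build the conditional-aggregation columns from the distinct type values at run time. The sketch below assumes SQLite through Python and integer type codes; the generated column names (valtype1, valtype2, ...) simply follow the pattern in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE extrafieldvalues (
        id INTEGER PRIMARY KEY, value INTEGER, type INTEGER, idItem INTEGER
    );
    INSERT INTO items VALUES (10, 'foo'), (11, 'bar');
    INSERT INTO extrafieldvalues VALUES
        (1, 100, 1, 10), (2, 150, 2, 10), (3, 101, 1, 11), (4, 90, 2, 11);
""")

# Discover which types exist; int() guards the values interpolated into the SQL.
types = [
    int(row[0])
    for row in conn.execute("SELECT DISTINCT type FROM extrafieldvalues ORDER BY type")
]

# One MAX(CASE ...) column per type, as in the hand-written query.
pivot_columns = ", ".join(
    f"MAX(CASE WHEN a.type = {t} THEN a.value END) AS valtype{t}" for t in types
)
sql = f"""
    SELECT a.idItem, {pivot_columns}, b.name
    FROM extrafieldvalues a
    INNER JOIN items b ON a.idItem = b.id
    GROUP BY a.idItem, b.name
    ORDER BY a.idItem
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

If a third type is added later, the same code picks it up and emits a valtype3 column without any change to the query text.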
I am new to Spark SQL and was experimenting with a few queries.
This is the query I am trying to execute:
sqlContext.sql("SELECT id, category, AVG(mark) FROM data GROUP BY id, category")
I am not getting the proper output when I run the query.
Instead of the actual value of category I am getting values such as 1, 2, 3.
I have been stuck on this weird error for a long time,
but when I run a simple select or a group by on a single column, it works perfectly:
sqlContext.sql("SELECT id, category FROM data")
sqlContext.sql("SELECT id, AVG(mark) FROM data GROUP BY id")
What is wrong? Does Spark SQL have a problem with grouping by multiple columns?
Right now I am running this more complex query:
sqlContext.sql("SELECT data.id, data.category, AVG(id_avg.met_avg) FROM (SELECT id, AVG(mark) AS met_avg FROM data GROUP BY id) AS id_avg, data GROUP BY data.category, data.id")
This works, but it takes longer to execute.
Please help.
Sample data:
|id | category | marks
| 1 | a | 40
| 2 | b | 44
| 3 | a | 50
| 4 | b | 40
| 1 | a | 30
The output should be:
|id | category | avg
| 1 | a | 35
| 2 | b | 44
| 3 | a | 50
| 4 | b | 40
Please try this query:
SELECT
data.id
, data.category
, AVG(mark)
FROM data
GROUP BY
data.id
, data.category
Based on this sample data:
|id | category | marks
| 1 | a | 40
| 2 | b | 44
| 3 | a | 50
| 4 | b | 40
| 1 | a | 30
The output WILL be this:
|id | category | avg
| 1 | a | 35
| 2 | b | 44
| 3 | a | 50
| 4 | b | 40
and, the following expected row cannot be produced using group by:
| 5 | a | 30
That is a bug in Spark SQL.
Try the next version; it is fixed there.
I got the proper output using spark-1.0.2.
It also worked with pure Scala code. Try either of them :)
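As a reference point for what a two-column GROUP BY should return on the sample data, the query can be run in another engine. This sketch uses SQLite through Python, not Spark, and assumes the value column is named mark as in the queries above (the sample header calls it marks).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, category TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, "a", 40), (2, "b", 44), (3, "a", 50), (4, "b", 40), (1, "a", 30)],
)

# Group by both columns; category should come back as its actual value.
rows = conn.execute("""
    SELECT id, category, AVG(mark) AS avg_mark
    FROM data
    GROUP BY id, category
    ORDER BY id
""").fetchall()
for row in rows:
    print(row)
```

If a Spark version returns something other than the original category values for the same query, that points at the engine rather than at the SQL.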
This should be a simple one, but say I have a table with data like this:
| ID | Date | Value |
| 1 | 01/01/2013 | 40 |
| 2 | 03/01/2013 | 20 |
| 3 | 10/01/2013 | 30 |
| 4 | 14/02/2013 | 60 |
| 5 | 15/03/2013 | 10 |
| 6 | 27/03/2013 | 70 |
| 7 | 01/04/2013 | 60 |
| 8 | 01/06/2013 | 20 |
What I want is the sum of values per week of the year, showing ALL weeks (for use in an Excel graph).
What my query gives me is only the weeks that are actually in the database.
With SQL you cannot return rows that don't exist in some table. To get the effect you want you could create a table called WeeksInYear with only one field WeekNumber that is an Int. Populate the table with all the week numbers. Then JOIN that table to this one.
The query would then look something like the following:
SELECT w.WeekNumber, COALESCE(SUM(m.Value), 0) AS WeekTotal
FROM MyTable AS m
RIGHT OUTER JOIN WeeksInYear AS w
ON DATEPART(wk, m.date) = w.WeekNumber
GROUP BY w.WeekNumber
The missing weeks have no matching rows in MyTable, so SUM returns NULL for them; the COALESCE turns that into 0.
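The weeks-table approach can be tried end to end with a small sketch. The assumptions here are mine: SQLite through Python instead of SQL Server, dates stored as ISO strings, strftime('%W') in place of DATEPART(wk, ...) (it numbers weeks 0 to 53, starting on Monday), and COALESCE so that empty weeks come out as 0, since a plain SUM over no rows gives NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Date TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, "2013-01-01", 40), (2, "2013-01-03", 20), (3, "2013-01-10", 30),
        (4, "2013-02-14", 60), (5, "2013-03-15", 10), (6, "2013-03-27", 70),
        (7, "2013-04-01", 60), (8, "2013-06-01", 20),
    ],
)

# Helper table holding every week number that strftime('%W') can produce.
conn.execute("CREATE TABLE WeeksInYear (WeekNumber INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO WeeksInYear VALUES (?)", [(w,) for w in range(54)])

# Start from the weeks table so every week appears, with or without data.
rows = conn.execute("""
    SELECT w.WeekNumber, COALESCE(SUM(m.Value), 0) AS WeekTotal
    FROM WeeksInYear AS w
    LEFT OUTER JOIN MyTable AS m
      ON CAST(strftime('%W', m.Date) AS INTEGER) = w.WeekNumber
    GROUP BY w.WeekNumber
    ORDER BY w.WeekNumber
""").fetchall()

print(len(rows), sum(total for _, total in rows))
```

A LEFT JOIN from the weeks table is equivalent to the RIGHT JOIN in the answer; it just puts the table that drives the row count first.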