I'm struggling with a query for the following problem.
I have this data in AWS Athena:
name | hobby | food | other_col
cris | sports | pasta | sdsd
cris | music | pizza | qfrfe
cris | sports | pizza | dcfrfe
cris | sports | pizza | koioio
arnold | sports | pasta | joiuhiu
arnold | art | salad | ojouju
arnold | art | pasta | jiojo
jenny | dance | sushi | sdkwdk
jenny | dance | sushi | lkjlj
jenny | ski | pizza | sdkwdk
jenny | dance | pasta | jlkjlkj
I need to get the most frequent value in each column, grouped by name, something like this:
name | hobby | food |
cris | sports | pizza |
arnold | art | pasta |
jenny | dance | sushi |
Could anyone help me, please?
You can use GROUPING SETS to compute the counts for each group (name-hobby and name-food) in one pass, and then use MAX_BY to pick, for each name, the value with the highest count:
-- sample data
with dataset(name, hobby, food) as (
values ('cris' , 'sports', 'pasta'),
('cris' , 'music' , 'pizza'),
('cris' , 'sports', 'pizza'),
('cris' , 'sports', 'pizza'),
('arnold', 'sports', 'pasta'),
('arnold', 'art' , 'salad'),
('arnold', 'art' , 'pasta'),
('jenny' ,'dance' ,'sushi'),
('jenny' ,'dance' ,'sushi'),
('jenny' , 'ski' ,'pizza'),
('jenny' ,'dance' ,'pasta')
)
-- query
select name,
       max_by(hobby, if(hobby is not null, cnt)) hobby,
       max_by(food, if(food is not null, cnt)) food
from (select name,
             hobby,
             food,
             count(*) cnt
      from dataset
      group by grouping sets ((name, hobby), (name, food)))
group by name;
Output:
name | hobby | food
jenny | dance | sushi
cris | sports | pizza
arnold | art | pasta
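To make the two steps of that query easier to follow, here is a minimal Python sketch of the same idea, not the Athena implementation itself. The function name most_frequent_per_name and the helper top_by_name are made up for illustration: the two counters play the role of the (name, hobby) and (name, food) grouping sets, and the loop plays the role of MAX_BY.

```python
from collections import Counter

# Sample rows copied from the question: (name, hobby, food).
ROWS = [
    ("cris", "sports", "pasta"),
    ("cris", "music", "pizza"),
    ("cris", "sports", "pizza"),
    ("cris", "sports", "pizza"),
    ("arnold", "sports", "pasta"),
    ("arnold", "art", "salad"),
    ("arnold", "art", "pasta"),
    ("jenny", "dance", "sushi"),
    ("jenny", "dance", "sushi"),
    ("jenny", "ski", "pizza"),
    ("jenny", "dance", "pasta"),
]


def top_by_name(counts):
    """Hypothetical helper: for each name keep the value with the highest count."""
    best = {}
    for (name, value), cnt in counts.items():
        # Strict '>' means the first value seen wins on a tie.
        if name not in best or cnt > best[name][1]:
            best[name] = (value, cnt)
    return {name: value for name, (value, _) in best.items()}


def most_frequent_per_name(rows):
    """Return {name: (most frequent hobby, most frequent food)}."""
    # Step 1: one counter per grouping set, keyed by (name, value).
    hobby_counts = Counter((name, hobby) for name, hobby, _ in rows)
    food_counts = Counter((name, food) for name, _, food in rows)
    # Step 2: per name, pick the value with the largest count (the MAX_BY step).
    hobbies = top_by_name(hobby_counts)
    foods = top_by_name(food_counts)
    return {name: (hobbies[name], foods[name]) for name in hobbies}


result = most_frequent_per_name(ROWS)
```

In this sketch a tie is broken by whichever value appears first in the input; the SQL version makes no such promise, so add an explicit tie-breaker if that matters.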
Another approach is to use histogram and a few map functions to find the key with the highest count in the map:
-- query
select name,
       map_keys(
           map_filter(
               h,
               (k, v) -> v = array_max(map_values(h))))[1] hobby,
       map_keys(
           map_filter(
               f,
               (k, v) -> v = array_max(map_values(f))))[1] food
from (select name,
             histogram(hobby) h,
             histogram(food) f
      from dataset
      group by name);
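The map expression is dense, so here is a rough Python equivalent of what it does for a single group, under the assumption that a plain dict of counts is a fair stand-in for the histogram map. The function name mode_via_histogram is hypothetical.

```python
from collections import Counter


def mode_via_histogram(values):
    """Return one most frequent value, mirroring the map-based SQL step by step."""
    h = Counter(values)                 # histogram(col): value -> count
    top = max(h.values())               # array_max(map_values(h))
    winners = [k for k, v in h.items() if v == top]  # map_keys(map_filter(...))
    return winners[0]                   # [1] in the SQL (arrays there are 1-based)


jenny_hobby = mode_via_histogram(["dance", "dance", "ski", "dance"])
jenny_food = mode_via_histogram(["sushi", "sushi", "pizza", "pasta"])
```

When several values share the top count, the filter keeps all of them and the index picks just one. This sketch returns the first one seen; I would not rely on any particular order in the SQL version.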
Related
Not sure if I have got the wording correct here, but I have this table:
| name | pets |
|-------|---------------|
| bob | cat, dog |
| steve | cat, parrot |
| dave | dog |
and I want it to become this:
| pet | names |
|--------|--------------|
| dog | bob, dave |
| cat | bob, steve |
| parrot | steve |
select regexp_split_to_table(pets, '\W\s') as pet
,string_agg(name, ', ') as names
from t
group by regexp_split_to_table(pets, '\W\s')
| pet    | names      |
|--------|------------|
| cat    | bob, steve |
| dog    | bob, dave  |
| parrot | steve      |
You can unnest the array and use a cross join to regroup:
select v, array_agg(t.name) from tbl t cross join unnest(t.pets) v group by v
See fiddle.
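Both answers do the same two things: split each pets value into one row per pet, then regroup the names by pet. A small Python sketch of that split-then-regroup logic, assuming pets is stored as a comma-separated string as in the question (the function name names_by_pet is made up):

```python
from collections import defaultdict

# Rows from the question: (name, comma-separated pets).
ROWS = [("bob", "cat, dog"), ("steve", "cat, parrot"), ("dave", "dog")]


def names_by_pet(rows):
    """Invert name -> pets into pet -> 'name, name, ...'."""
    grouped = defaultdict(list)
    for name, pets in rows:
        # Split step: one (pet, name) pair per listed pet.
        for pet in pets.split(","):
            grouped[pet.strip()].append(name)
    # Regroup step: join the collected names, like string_agg(name, ', ').
    return {pet: ", ".join(names) for pet, names in grouped.items()}


pets_to_names = names_by_pet(ROWS)
```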
table 'product'
------------------------------------------
| id | product_name | product_description|
------------------------------------------
| 1. | abc | this is abc's desc |
------------------------------------------
Junction table 'ingredient_product'
------------------------------------------
| id | product_id | ingredient_id |
------------------------------------------
| 1 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 1 | 3 |
| 1 | 1 | 4 |
| 1 | 1 | 5 |
| 1 | 1 | 6 |
------------------------------------------
table 'ingredient'
------------------------
| id | ingredient_name |
------------------------
| 1 | apple |
| 2 | chicken |
| 3 | beef |
| 4 | beet |
| 5 | oat |
| 6 | pea fibre |
------------------------
I have three tables and tried a query like the one below:
SELECT
product.name AS product_name,
product.description AS product_description,
product.created_at AS product_created,
ingredient.name AS ingredient_name
FROM
product
JOIN
ingredient_product
ON
ingredient_product.product_id = product.id
JOIN
ingredient
ON
ingredient.id = ingredient_product.ingredient_id
WHERE
ingredient_product.product_id = 1;
and I get a result like the one below:
{product_name: "fromm gold", product_description: "For puppies and pregnant or nursing mothers. Taste… aid digestion and salmon oil for a healthy coat.", ingredient_name: "banana"}
{product_name: "fromm gold", product_description: "For puppies and pregnant or nursing mothers. Taste… aid digestion and salmon oil for a healthy coat.", ingredient_name: "strawberry"}
{product_name: "fromm gold", product_description: "For puppies and pregnant or nursing mothers. Taste… aid digestion and salmon oil for a healthy coat.", ingredient_name: "canola oil"}
{product_name: "fromm gold", product_description: "For puppies and pregnant or nursing mothers. Taste… aid digestion and salmon oil for a healthy coat.", ingredient_name: "pilchard"}
{product_name: "fromm gold", product_description: "For puppies and pregnant or nursing mothers. Taste… aid digestion and salmon oil for a healthy coat.", ingredient_name: "ground beef"}
{product_name: "fromm gold", product_description: "For puppies and pregnant or nursing mothers. Taste… aid digestion and salmon oil for a healthy coat.", ingredient_name: "cranberry"}
I get all the different ingredients, but the product details are repeated on every row and I want them to appear only once.
Is there a better way to write this kind of query?
Thank you in advance!
You seem to want aggregation, something like this:
SELECT p.name AS product_name, p.description AS product_description,
p.created_at AS product_created,
ARRAY_AGG(i.name) AS ingredient_names
FROM product p JOIN
ingredient_product ip
ON ip.product_id = p.id JOIN
ingredient i
ON i.id = ip.ingredient_id
WHERE p.id = 1
GROUP BY p.id;
Notes:
Table aliases make the query easier to write and read.
The WHERE clause filters on the primary key of product rather than on the equivalent column in ingredient_product. I think that primary keys may help the optimizer.
This adds a GROUP BY, because you want one row per product. This is aggregating by the primary key, so the SELECT can contain other columns.
The array_agg() brings the ingredients together as an array.
Do you want to list the product details only once, together with a list of ingredients?
Have a look at the aggregate functions, e.g. here: https://www.postgresql.org/docs/9.5/functions-aggregate.html .
I think the string_agg function will help you.
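To try the one-row-per-product idea without a PostgreSQL server, here is a small sketch using Python's built-in sqlite3 module. It is only a stand-in: SQLite has no ARRAY_AGG or string_agg in older versions, so GROUP_CONCAT is used instead, and the column names (name, description) follow the queries above rather than the table listing. The created_at column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
CREATE TABLE ingredient (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ingredient_product (product_id INTEGER, ingredient_id INTEGER);
INSERT INTO product VALUES (1, 'abc', 'this is abc''s desc');
INSERT INTO ingredient VALUES (1, 'apple'), (2, 'chicken'), (3, 'beef');
INSERT INTO ingredient_product VALUES (1, 1), (1, 2), (1, 3);
""")

# Grouping by the product's primary key collapses the joined rows into one,
# and the aggregate gathers the ingredient names into a single field.
rows = conn.execute("""
    SELECT p.name, p.description, GROUP_CONCAT(i.name, ', ') AS ingredient_names
    FROM product p
    JOIN ingredient_product ip ON ip.product_id = p.id
    JOIN ingredient i ON i.id = ip.ingredient_id
    WHERE p.id = 1
    GROUP BY p.id
""").fetchall()
```

The order of names inside the aggregated field is not guaranteed here, so treat it as a set unless you add explicit ordering.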
If I have a student_grade_history table containing past grade history for students, for example:
student | course | year | grade
--------+----------+------------+----------
steve | Math1A | 2018spring | A+
steve | English | 2018spring | B-
steve | Science | 2018spring | B+
steve | Biology | 2018spring | C
and I would like to count the grades each student received, folding A+, A, A- into grade A, B+, B, B- into grade B, and C+, C, C- into grade C.
I am able to create a table like the following:
student | year | # of A received | # of B received | # of C received
--------+------------+-----------------+-----------------+-----------------
steve | 2018spring | 1 | 2 | 1
However, I am trying to create a table with the following format based on the student_grade_history table, and I cannot think of a way to do it.
student | year | grade | count
--------+------------+-------+------
steve | 2018spring | A | 1
steve | 2018spring | B | 2
steve | 2018spring | C | 1
Can I get some hints on how to approach this?
You seem to want aggregation:
select student, year, left(grade, 1) as grade, count(*)
from student_grade_history
group by student, year, left(grade, 1);
Not all databases support left() (but most do). All have the functionality, even if it goes by a slightly different syntax.
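As an illustration of the "slightly different syntax" point, SQLite is one engine without left(); there substr(grade, 1, 1) does the same job. A minimal sketch using Python's built-in sqlite3 module, with the sample rows from the question (the alias names grade_letter and cnt are my own):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_grade_history "
    "(student TEXT, course TEXT, year TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO student_grade_history VALUES (?, ?, ?, ?)",
    [
        ("steve", "Math1A", "2018spring", "A+"),
        ("steve", "English", "2018spring", "B-"),
        ("steve", "Science", "2018spring", "B+"),
        ("steve", "Biology", "2018spring", "C"),
    ],
)

# substr(grade, 1, 1) keeps only the letter, so A+, A and A- fall into one group.
grade_counts = conn.execute("""
    SELECT student, year, substr(grade, 1, 1) AS grade_letter, count(*) AS cnt
    FROM student_grade_history
    GROUP BY student, year, substr(grade, 1, 1)
    ORDER BY student, year, grade_letter
""").fetchall()
```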
I have a very simple data set that I would like to be able to query and get the results as a single record.
Members Table
ID | FirstName | LastName | HeroName
42 | Bruce | Wayne | Batman
1337 | Bruce | Banner | Hulk
1033 | Clark | Kent | Newspaper Boy
Skills Table
ID | Skill
42 | Martial Arts
42 | Engineering
42 | Intimidation
1337 | Anger Management
1337 | Thermo Nuclear Dynamics
1033 | NULL
I want the result to be
ID | FirstName | LastName | HeroName | Skill1 | Skill2 | Skill3 | ... | Skilln
42 | Bruce | Wayne | Batman | Martial Arts | Engineering | Intimidation
The query I have so far is
SELECT m.ID, m.FirstName, m.LastName, m.HeroName, s.Skill
FROM Members m
JOIN Skills s
ON m.ID = s.ID
WHERE m.ID = 42 and s.Skill IS NOT NULL
which returns
ID | FirstName | LastName | HeroName | Skill
42 | Bruce | Wayne | Batman | Martial Arts
42 | Bruce | Wayne | Batman | Engineering
42 | Bruce | Wayne | Batman | Intimidation
Short of iterating over the results and extracting only the fields I want, is there a way to return this as a single record? I've seen topics on PIVOT and XmlPath, but from what I've read neither does quite what I want. I'd like an arbitrary number of skills to be returned, with no nulls.
EDIT:
The problem with PIVOT is that it will turn one of the rows into a column header. If there is a way to fill in a generic column header, then it might work.
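One possible direction, sketched here with Python's built-in sqlite3 module rather than the asker's database: aggregate the skills into a single delimited field so the query returns one row per member, then split that field on the client to get Skill1..Skilln. GROUP_CONCAT is SQLite's aggregate; the '|' delimiter and the function name member_with_skills are assumptions for this example, and the delimiter must not occur inside a skill.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Members (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, HeroName TEXT);
CREATE TABLE Skills (ID INTEGER, Skill TEXT);
INSERT INTO Members VALUES
    (42, 'Bruce', 'Wayne', 'Batman'),
    (1337, 'Bruce', 'Banner', 'Hulk'),
    (1033, 'Clark', 'Kent', 'Newspaper Boy');
INSERT INTO Skills VALUES
    (42, 'Martial Arts'), (42, 'Engineering'), (42, 'Intimidation'),
    (1337, 'Anger Management'), (1337, 'Thermo Nuclear Dynamics'),
    (1033, NULL);
""")


def member_with_skills(member_id):
    """Return (ID, FirstName, LastName, HeroName, skill, ...) or None if no skills."""
    row = conn.execute("""
        SELECT m.ID, m.FirstName, m.LastName, m.HeroName,
               GROUP_CONCAT(s.Skill, '|') AS skills
        FROM Members m
        JOIN Skills s ON m.ID = s.ID
        WHERE m.ID = ? AND s.Skill IS NOT NULL
        GROUP BY m.ID
    """, (member_id,)).fetchone()
    if row is None:
        return None
    *member, skills = row
    # Splitting yields as many skill fields as the member has, with no nulls.
    return tuple(member) + tuple(skills.split("|"))


batman = member_with_skills(42)
```

This still returns a variable-length record, which a fixed SQL result set cannot express directly; that is why the split happens outside the query in this sketch.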
I have three tables like this:
Person table:
person_id | name | dob
--------------------------------
1 | Naveed | 1988
2 | Ali | 1985
3 | Khan | 1987
4 | Rizwan | 1984
Address table:
address_id | street | city | state | country
----------------------------------------------------
1 | MAJ Road | Karachi | Sindh | Pakistan
2 | ABC Road | Multan | Punjab | Pakistan
3 | XYZ Road | Riyadh | SA | SA
Person_Address table:
person_id | address_id
----------------------
1 | 1
2 | 2
3 | 3
Now I want to get all records of the Person_Address table, together with their person and address records, in one query, like this:
person_id| name | dob | address_id | street | city | state | country
----------------------------------------------------------------------------------
1 | Naveed | 1988 | 1 | MAJ Road | Karachi | Sindh | Pakistan
2 | Ali | 1985 | 2 | ABC Road | Multan | Punjab | Pakistan
3 | Khan | 1987 | 3 | XYZ Road | Riyadh | SA | SA
How is this possible using Zend? Thanks.
The reference guide is the best starting point to learn about Zend_Db_Select. Along with my example below, of course:
//$db is an instance of Zend_Db_Adapter_Abstract
$select = $db->select();
$select->from(array('p' => 'person'), array('person_id', 'name', 'dob'))
->join(array('pa' => 'Person_Address'), 'pa.person_id = p.person_id', array())
->join(array('a' => 'Address'), 'a.address_id = pa.address_id', array('address_id', 'street', 'city', 'state', 'country'));
It's then as simple as this to fetch a row:
$db->fetchRow($select);
When debugging Zend_Db_Select there's a handy trick you can use: simply print the select object, which in turn invokes its __toString method to produce the SQL:
echo $select; //prints SQL
I'm not sure if you're looking for SQL to do the above, or code using Zend's facilities. Given the presence of "sql" and "joins" in the tags, here's the SQL you'd need:
SELECT p.person_id, p.name, p.dob, a.address_id, street, city, state, country
FROM person p
INNER JOIN Person_Address pa ON pa.person_id = p.person_id
INNER JOIN Address a ON a.address_id = pa.address_id
Bear in mind that the Person_Address table tells us that there's a many-to-many relationship between a Person and an Address. Many Persons may share an Address, and a Person may have more than one Address.
The SQL above will show ALL such relationships. So if Naveed has two Address records, you will have two rows in the result set with person_id = 1.
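The row-multiplication effect is easy to check on the sample data. The sketch below uses Python's built-in sqlite3 module as a stand-in database; the second Person_Address row for Naveed is hypothetical and added only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, dob INTEGER);
CREATE TABLE Address (address_id INTEGER PRIMARY KEY, street TEXT, city TEXT,
                      state TEXT, country TEXT);
CREATE TABLE Person_Address (person_id INTEGER, address_id INTEGER);
INSERT INTO person VALUES (1, 'Naveed', 1988), (2, 'Ali', 1985),
                          (3, 'Khan', 1987), (4, 'Rizwan', 1984);
INSERT INTO Address VALUES (1, 'MAJ Road', 'Karachi', 'Sindh', 'Pakistan'),
                           (2, 'ABC Road', 'Multan', 'Punjab', 'Pakistan'),
                           (3, 'XYZ Road', 'Riyadh', 'SA', 'SA');
INSERT INTO Person_Address VALUES (1, 1), (2, 2), (3, 3);
""")

JOIN_SQL = """
    SELECT p.person_id, p.name, p.dob, a.address_id, a.street, a.city, a.state, a.country
    FROM person p
    INNER JOIN Person_Address pa ON pa.person_id = p.person_id
    INNER JOIN Address a ON a.address_id = pa.address_id
    ORDER BY p.person_id, a.address_id
"""

before = conn.execute(JOIN_SQL).fetchall()

# Hypothetical extra link: Naveed (person_id 1) also lives at address 2.
conn.execute("INSERT INTO Person_Address VALUES (1, 2)")
after = conn.execute(JOIN_SQL).fetchall()
```

Note also that Rizwan never appears in either result, because an inner join drops persons with no Person_Address row.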