How to query a DataFrame in a for loop with changing values? ValueError: Lengths must match to compare - pandas

This works, but I had to type it four separate times, once for each value in the charge_names list:
charge_names = ['Vehicle Theft','Robbery','Burglary','Receive Stolen Property']
charges[charges['Charge Group Description']== 'Vehicle Theft'].head(2)
I tried to run a for loop like this:
charge_names = ['Vehicle Theft','Robbery','Burglary','Receive Stolen Property']
for name in charge_names:
    charges[charges['Charge Group Description']== name].head(2)
but without much success.
This does not work either:
charges[['Charge Group Description'].isin(['Robbery', 'Burglary'])]
How can I query for all 4 values in the charge_names list in one line?

DataFrame.isin: Whether each element in the DataFrame is contained in values.
DataFrame.groupby: Group DataFrame based on entries.
charge_names = ['Vehicle Theft','Robbery','Burglary','Receive Stolen Property']
charges[charges['Charge Group Description'].isin(charge_names)].groupby('Charge Group Description').head(2)
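To see what that one-liner returns, here is a minimal sketch on made-up data. Only the column name 'Charge Group Description' comes from the question; the rows and the 'Report ID' column are assumptions for illustration. As far as I know, the "Lengths must match to compare" error in the title is what you typically get when a column is compared to the whole list with ==, which is why isin is used instead.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
charges = pd.DataFrame({
    "Charge Group Description": [
        "Robbery", "Burglary", "Robbery", "Vehicle Theft",
        "Robbery", "Fraud", "Burglary", "Receive Stolen Property",
    ],
    "Report ID": [1, 2, 3, 4, 5, 6, 7, 8],
})

charge_names = ["Vehicle Theft", "Robbery", "Burglary", "Receive Stolen Property"]

# Boolean mask: True where the row's value is one of charge_names.
mask = charges["Charge Group Description"].isin(charge_names)

# groupby(...).head(2) keeps at most the first two rows of each group,
# returned in the original row order.
first_two = charges[mask].groupby("Charge Group Description").head(2)
print(first_two)
```

The third 'Robbery' row and the 'Fraud' row are dropped; everything else is kept.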

Related

How to groupby in subquery SQL

I have the following query:
SELECT "IMPORTACIONCOLUMN3", "role", COUNT("IMPORTACIONCOLUMN1")
FROM MYTABLE
WHERE ID = 9
GROUP BY "IMPORTACIONCOLUMN3", "role"
This gives me the following result:
I would like to achieve the following:
The "unique" values are grouped together (i.e. instead of having 4 rows for "Robot 1", these are grouped together in just 1 cell, summing the count values).
The second GROUP BY or subquery has to produce the same count, but by role instead of "IMPORTACIONCOLUMN3".
Is it possible (for the second picture) to "link" the values, either by index or by adding an extra column to reference them? For example, there are two "Solicitante" rows: one with a count value of "52" that refers to "Robot 1", and another with a count value of "58" that links to "Robot 2".
The second image represents visually what I'm trying to explain.
I have been trying on my own, but have only reached the following:
select "IMPORTACIONCOLUMN3", count("IMPORTACIONCOLUMN1")
from
(
select "IMPORTACIONCOLUMN1", count("role"), "IMPORTACIONCOLUMN3"
from MYTABLE
WHERE ID = 9
group by "IMPORTACIONCOLUMN1", "IMPORTACIONCOLUMN3"
) as tmp
group by "IMPORTACIONCOLUMN3"
But it is not yet the result I am looking for.
Thanks in advance for your help and tips!
EDIT:
Explaining my desired output in detail
Each of "Robot 1, 2, 3" has roles such as "Solicitante", "Gerente", etc. with different values.
For example, in the first row, the "Humano" value "243" is the sum of "Agente de Compras - 95", "Gerente Financiero - 37", "Gerente Solicitante - 45", "Proovedor - 31", "Solicitante - 60".
I am linking these by the column "GRAFICOCOLUMNARECURSIVOID", which contains the index of whichever "Robot" these "roles" belong to.
I would like to achieve a query containing subqueries that allows me to have this output.
Try this for question number 1:
SELECT "IMPORTACIONCOLUMN3", COUNT("IMPORTACIONCOLUMN1")
FROM MYTABLE
WHERE ID = 9
GROUP BY "IMPORTACIONCOLUMN3"
The problem is the role: Robot 1 has 4 roles.
And this for question 2:
SELECT "role", COUNT("IMPORTACIONCOLUMN1")
FROM MYTABLE
WHERE ID = 9
GROUP BY "role"
Question 3, I don't understand what you are asking for. Please make an example.
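For the "linking" part, one possible reading is that each per-role count should carry the total of the robot it belongs to. Here is a minimal sketch of that idea, run through Python's sqlite3 so it can be tried without a server. The rows are invented, and joining a per-role subquery to a per-robot subquery is my assumption about what is wanted, not something confirmed by the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE MYTABLE (ID INTEGER, "IMPORTACIONCOLUMN1" TEXT, '
    '"IMPORTACIONCOLUMN3" TEXT, "role" TEXT)'
)
# Hypothetical rows: (ID, IMPORTACIONCOLUMN1, IMPORTACIONCOLUMN3, role)
conn.executemany("INSERT INTO MYTABLE VALUES (?, ?, ?, ?)", [
    (9, "a", "Robot 1", "Solicitante"),
    (9, "b", "Robot 1", "Solicitante"),
    (9, "c", "Robot 1", "Gerente"),
    (9, "d", "Robot 2", "Solicitante"),
    (9, "e", "Robot 2", "Gerente"),
    (9, "f", "Robot 2", "Gerente"),
    (8, "g", "Robot 1", "Gerente"),  # different ID, filtered out by WHERE
])

# r: count per (robot, role); t: count per robot. The join attaches the
# robot's total to every one of its role rows.
linked = conn.execute("""
    SELECT r."IMPORTACIONCOLUMN3", r."role", r.role_count, t.total_count
    FROM (
        SELECT "IMPORTACIONCOLUMN3", "role",
               COUNT("IMPORTACIONCOLUMN1") AS role_count
        FROM MYTABLE
        WHERE ID = 9
        GROUP BY "IMPORTACIONCOLUMN3", "role"
    ) AS r
    JOIN (
        SELECT "IMPORTACIONCOLUMN3",
               COUNT("IMPORTACIONCOLUMN1") AS total_count
        FROM MYTABLE
        WHERE ID = 9
        GROUP BY "IMPORTACIONCOLUMN3"
    ) AS t ON t."IMPORTACIONCOLUMN3" = r."IMPORTACIONCOLUMN3"
    ORDER BY r."IMPORTACIONCOLUMN3", r."role"
""").fetchall()

for row in linked:
    print(row)
```

Each output row then says which robot a role count belongs to, so two "Solicitante" rows with different counts are no longer ambiguous.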

Multiple outputs with different WHERE clauses

What do I have to change to get a separate result for each name? The table should give me the debts of each person, calculated from the quantity and the price of the drink. It should show all the names, each with the corresponding amount calculated in the select.
%sql select name, sum(getraenk.preis*schulden.menge) schulden from schulden \
join person on (fk_person = person.id)\
join getraenk on (fk_getraenk = getraenk.id)\
where name like ("dani")
Edit: it should output all the names with their debts, that is:
dani = 8.5
michael = 12.5
...
Just in case your problem is very simple, you should be able to see all names and values with an SQL that looks like this:
select name, getraenk.preis*schulden.menge schulden
from schulden
join person on (fk_person = person.id)
join getraenk on (fk_getraenk = getraenk.id)
Note that I removed the where clause... this was the part that limited it to one name.
You also don't need the sum clause here unless you are doing a GROUP BY.
Have you considered simply using GROUP BY name at the end of this query?
https://www.w3schools.com/sql/sql_groupby.asp
This will give you the sum of total debt for all names in your table which sounds like the result you are looking for.
You're missing
GROUP BY name
in the query.
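A minimal sketch of the GROUP BY version, run against an in-memory SQLite database from Python so it is easy to try. The table and column names follow the question; the rows, the prices, and the total alias are assumptions chosen so that the totals match the expected output (dani = 8.5, michael = 12.5).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE getraenk (id INTEGER PRIMARY KEY, preis REAL);
    CREATE TABLE schulden (fk_person INTEGER, fk_getraenk INTEGER, menge INTEGER);

    -- Hypothetical data
    INSERT INTO person   VALUES (1, 'dani'), (2, 'michael');
    INSERT INTO getraenk VALUES (1, 2.5), (2, 1.5);
    INSERT INTO schulden VALUES (1, 1, 1), (1, 2, 4), (2, 1, 5);
""")

# No WHERE clause, so every person is included; GROUP BY name makes
# SUM() add up the rows of each person separately.
totals = conn.execute("""
    SELECT name, SUM(getraenk.preis * schulden.menge) AS total
    FROM schulden
    JOIN person   ON (fk_person = person.id)
    JOIN getraenk ON (fk_getraenk = getraenk.id)
    GROUP BY name
    ORDER BY name
""").fetchall()

for name, total in totals:
    print(name, "=", total)
```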

Pig script to get top 3 data in a single record

I have sample data in the form
user_id, date, accessed url, session time
The data is meant to give the top 3 interests of each user, based on session time.
I got the data using this code:
top3 = FOREACH DataSet {
    sorted = ORDER DataSet BY sessiontime DESC;
    lim = LIMIT sorted 3;
    GENERATE flatten(group), flatten(lim);
};
Output:
(1,20,url1,2484)
(1,20,url2,1863)
(1,20,url3,1242)
(2,22,url4,484)
(2,22,url5,63)
(2,22,url6,42)
(3,25,url7,500)
(3,25,url8,350)
(3,25,url9,242)
But I want my output to be like this:
(1,20,url1,url2,url3)
(2,22,url4,url5,url6)
(3,25,url7,url8,url9)
Please help.
You are close. The problem is that you FLATTEN the bag of URLs when you really want to keep them all in one record. So do this instead:
top3 = FOREACH DataSet {
    sorted = ORDER DataSet BY sessiontime DESC;
    lim = LIMIT sorted 3;
    GENERATE flatten(group), lim.url;
};
Based on the output you got, you will now get
(1,20,{(url1),(url2),(url3)})
(2,22,{(url4),(url5),(url6)})
(3,25,{(url7),(url8),(url9)})
Note that the URLs are contained inside a bag. If you want to have them as three top-level fields, you will need to use a UDF to convert a bag into a tuple, and then FLATTEN that.
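To make the bag-to-tuple step concrete, here is a plain Python sketch of the transformation such a UDF would have to perform. It is not a Pig UDF and does not use any Pig API; the function name top_urls_per_user is hypothetical, and the records are the output rows from the question.

```python
from itertools import groupby
from operator import itemgetter

# (user_id, date, url, session_time), taken from the question's output
records = [
    (1, 20, "url1", 2484), (1, 20, "url2", 1863), (1, 20, "url3", 1242),
    (2, 22, "url4", 484), (2, 22, "url5", 63), (2, 22, "url6", 42),
    (3, 25, "url7", 500), (3, 25, "url8", 350), (3, 25, "url9", 242),
]

def top_urls_per_user(rows, n=3):
    """Return one flat tuple (user_id, date, url_1, ..., url_n) per group."""
    key = itemgetter(0, 1)  # group on (user_id, date)
    # groupby needs equal keys to be adjacent, so sort by key first,
    # then by session time descending within each key.
    ordered = sorted(rows, key=lambda r: (key(r), -r[3]))
    result = []
    for group_key, group_rows in groupby(ordered, key=key):
        urls = [r[2] for r in list(group_rows)[:n]]
        # Appending the URLs to the key tuple is the "flatten" step.
        result.append(group_key + tuple(urls))
    return result

for record in top_urls_per_user(records):
    print(record)
```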

Compare two rows in AND condition

I'm having a problem using the AND operator in SQL: it returns an empty result set.
I have the following table structure:
idcompany, cloudid, cloudkey, idsearchfield, type, userValue
Now I execute the following statement:
SELECT *
FROM filter_view
WHERE
(idsearchfield = 4 and compareResearch(userValue,200) = true)
AND (idsearchfield = 6 and compareResearch(userValue,1) = true)
compareResearch is just a function that casts userValue, compares it to the other value, and returns true if userValue is equal or greater. userValue is actually stored as a string (that's a decision made 6 years ago).
Okay, I get an empty result set because both criteria in parentheses () are combined with AND, and one row can only have one idsearchfield, so one of the criteria will never match.
How do I get around this? I NEED the AND comparison, but it won't work this way.
I hope my problem is obvious :-)
If you've recognised that both conditions can't ever both be true, in what way can the AND comparison be the correct one?
select *
from filter_view
where (idsearchfield = 4 and compareResearch(userValue,200) = true)
OR (idsearchfield = 6 and compareResearch(userValue,1) = true)
This will return 2 rows (or more). Or are you looking for some way to correlate these two rows so that they appear as a single row?
Okay, so I'm making a tonne of assumptions here, because you haven't included enough information in your question:
filter_view returns a number of columns, one of which is some form of record identifier (let's call that ID). It also includes the aforementioned idsearchfield and userValue columns.
What you actually want to find is those id values, for which one row of filter_view has idsearchfield = 4 and compareResearch(userValue,200) = true and another row of filter_view has idsearchfield = 6 and compareResearch(userValue,1) = true
The general term for this is "relational division". In this simple case, and assuming that id/idsearchfield are unique in this view, we can answer it with:
select id,COUNT(*)
from filter_view
where (idsearchfield = 4 and compareResearch(userValue,200) = true)
OR (idsearchfield = 6 and compareResearch(userValue,1) = true)
group by id
having COUNT(*) = 2
If this doesn't answer your question, you're going to have to add more info to your question, including sample data, and expected results.
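Here is a small sketch of that GROUP BY / HAVING pattern, run from Python against an in-memory SQLite database. Everything about the data is assumed: the id column is the identifier guessed at above, the rows are invented, and compare_research is a hypothetical stand-in for the real compareResearch function (it casts the stored string and tests "greater than or equal").

```python
import sqlite3

def compare_research(user_value, threshold):
    """Hypothetical stand-in: cast the stored string, test value >= threshold."""
    return 1 if float(user_value) >= threshold else 0

conn = sqlite3.connect(":memory:")
# Register the Python function so the SQL below can call compareResearch(...)
conn.create_function("compareResearch", 2, compare_research)

conn.execute(
    "CREATE TABLE filter_view (id INTEGER, idsearchfield INTEGER, userValue TEXT)"
)
conn.executemany("INSERT INTO filter_view VALUES (?, ?, ?)", [
    (1, 4, "250"), (1, 6, "3"),  # satisfies both conditions
    (2, 4, "250"), (2, 6, "0"),  # fails the field 6 condition
    (3, 4, "100"), (3, 6, "5"),  # fails the field 4 condition
    (4, 6, "2"),                 # has no field 4 row at all
])

# OR selects the individual rows that pass either condition; HAVING
# COUNT(*) = 2 keeps only ids for which both conditions found a row.
matching = [row[0] for row in conn.execute("""
    SELECT id
    FROM filter_view
    WHERE (idsearchfield = 4 AND compareResearch(userValue, 200) = 1)
       OR (idsearchfield = 6 AND compareResearch(userValue, 1) = 1)
    GROUP BY id
    HAVING COUNT(*) = 2
    ORDER BY id
""")]
print(matching)
```

Only id 1 survives, because it is the only id with a passing row for both search fields. Note this relies on the stated assumption that each id has at most one row per idsearchfield.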

Shifting mysql database values from a couple of columns to rows of entries

Easier to describe by showing a simplified view of the existing data structure and the desired result...
CURRENTLY...
Element     Response         ElementType  ElementNumber
EntryVal.1  1234.56          EntryVal     1
EntryDes.1  'Current Value'  EntryDes     1
EntryVal.2  4321.0           EntryVal     2
EntryDes.2  'Another Value'  EntryDes     2
EntryVal.3  6543.21          EntryVal     3
EntryDes.3  'Final Value'    EntryDes     3
DESIRED...
Name           Value
Current Value  1234.56
Another Value  4321.0
Final Value    6543.21
(I split the Element column into ElementType and ElementNumber columns in the hope that it might help.)
I have tried various sub-selects but have not found the secret.
I could do some looping in PHP, but I hope there is a more elegant single-query MySQL approach.
There are other columns, such as location, involved, so I am trying to keep it clean.
Here's how I'd do it:
SELECT des.Response AS Name, val.Response AS Value
FROM MyTable AS des JOIN MyTable AS val USING (ElementNumber)
WHERE des.ElementType = 'EntryDes' AND val.ElementType = 'EntryVal';
Use:
SELECT MAX(CASE WHEN t.elementtype = 'EntryDes' THEN t.response END) AS Name,
       MAX(CASE WHEN t.elementtype = 'EntryVal' THEN t.response END) AS Value
FROM YOUR_TABLE t
GROUP BY t.elementnumber
You might want to keep elementnumber as a column, in case you need to ensure order.
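Both approaches can be checked side by side with a quick sketch. This uses Python's sqlite3 instead of MySQL purely so it runs anywhere; the table name MyTable and the TEXT type for Response are assumptions, and the rows are the ones from the question. The ORDER BY is added so the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (Element TEXT, Response TEXT, "
    "ElementType TEXT, ElementNumber INTEGER)"
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", [
    ("EntryVal.1", "1234.56", "EntryVal", 1),
    ("EntryDes.1", "Current Value", "EntryDes", 1),
    ("EntryVal.2", "4321.0", "EntryVal", 2),
    ("EntryDes.2", "Another Value", "EntryDes", 2),
    ("EntryVal.3", "6543.21", "EntryVal", 3),
    ("EntryDes.3", "Final Value", "EntryDes", 3),
])

# Approach 1: self-join, pairing each description row with its value row.
join_rows = conn.execute("""
    SELECT des.Response AS Name, val.Response AS Value
    FROM MyTable AS des JOIN MyTable AS val USING (ElementNumber)
    WHERE des.ElementType = 'EntryDes' AND val.ElementType = 'EntryVal'
    ORDER BY des.ElementNumber
""").fetchall()

# Approach 2: conditional aggregation. Each CASE yields NULL for the
# other row type, and MAX ignores NULLs, so one value per group remains.
pivot_rows = conn.execute("""
    SELECT MAX(CASE WHEN ElementType = 'EntryDes' THEN Response END) AS Name,
           MAX(CASE WHEN ElementType = 'EntryVal' THEN Response END) AS Value
    FROM MyTable
    GROUP BY ElementNumber
    ORDER BY ElementNumber
""").fetchall()

print(join_rows)
print(pivot_rows)
```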