R - Using sqldf to query multiple values from one column in dataframe - sql

I'm pretty new to R and trying to use the sqldf package to query a dataset. I have constructed the following query, which works perfectly and displays the correct data:
sqldf("select AreaName, TimePeriod, Value from df2 where Indicator == 'Obese children (Year 6)' AND AreaName == 'Barking and Dagenham'",
row.names = TRUE)
But I would like to pull the data for 'Richmond upon Thames' as well as Barking and Dagenham. I have tried this:
AND AreaName == 'Barking and Dagenham', 'Richmond upon Thames'
Which gives me the following error:
Error in sqliteSendQuery(con, statement, bind.data) : error in statement: near ",": syntax error
And I have also tried:
AND AreaName == 'Barking and Dagenham' AND AreaName == 'Richmond upon Thames'
Which creates the new dataframe as expected but when I view it, it is empty. I know it is not an issue with the name 'Richmond upon Thames' as I have entered this into the first statement by itself instead of 'Barking and Dagenham' and it works perfectly.
Could anybody help me with what the correct structure should be?
Many thanks
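One way to see what is going on: a single row can never have AreaName equal to two different values at once, so chaining the two comparisons with AND matches nothing, while SQL's IN operator (or OR) matches either value. The sketch below is only an illustration, not the asker's data: it uses Python's sqlite3 module (sqldf runs on SQLite by default), and the table rows and values are made up. In sqldf the same condition would read AND AreaName IN ('Barking and Dagenham', 'Richmond upon Thames').

```python
import sqlite3

# In-memory stand-in for the df2 data frame; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE df2 (AreaName TEXT, Indicator TEXT, TimePeriod TEXT, Value REAL)"
)
conn.executemany(
    "INSERT INTO df2 VALUES (?, ?, ?, ?)",
    [
        ("Barking and Dagenham", "Obese children (Year 6)", "2010/11", 1.0),
        ("Richmond upon Thames", "Obese children (Year 6)", "2010/11", 2.0),
        ("Camden", "Obese children (Year 6)", "2010/11", 3.0),
    ],
)

base = (
    "SELECT AreaName, TimePeriod, Value FROM df2 "
    "WHERE Indicator = 'Obese children (Year 6)' AND "
)

# AND on the same column with two different values: no row can satisfy both.
with_and = conn.execute(
    base + "AreaName = 'Barking and Dagenham' AND AreaName = 'Richmond upon Thames'"
).fetchall()

# IN matches a row whose AreaName equals any value in the list.
with_in = conn.execute(
    base + "AreaName IN ('Barking and Dagenham', 'Richmond upon Thames') "
    "ORDER BY AreaName"
).fetchall()

print(with_and)                        # []
print([row[0] for row in with_in])     # ['Barking and Dagenham', 'Richmond upon Thames']
```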

Related

Retrieve df from spark.sql : [PARSE_SYNTAX_ERROR] Syntax error at or near 'SELECT'

I'm using a databricks notebook and I'd like to retrieve a dataframe from an SQL execution in Spark. I have:
statement = f""" USER {db}; SELECT * FROM {table}
"""
df = spark.sql(statement)
display(df)
However, unlike when I fire off the same statement in an SQL cell in the notebook, I get the following error:
[PARSE_SYNTAX_ERROR] Syntax error at or near 'SELECT': extra input 'SELECT'(line 1...
Where am I going wrong?
I tried to reproduce this in my environment and got the results below.
This is my sample demo table, Persons.
Create the dataframe from a single SELECT statement:
df = sqlContext.sql("select * from Persons")
display(df)
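A likely explanation, offered as an assumption rather than something the answer above states: spark.sql() parses one statement per call, so a string that holds two statements separated by a semicolon fails when the parser reaches the second one. A minimal sketch of a workaround is to run each statement in its own call. The helper name run_statements is hypothetical, and fake_sql stands in for spark.sql so the example runs without a Spark session.

```python
def run_statements(run_sql, script):
    """Run each ';'-separated statement in its own call and return the last result.

    Naive split: it does not handle semicolons inside string literals or comments.
    """
    result = None
    for statement in script.split(";"):
        statement = statement.strip()
        if statement:
            result = run_sql(statement)
    return result


# Stand-in for spark.sql: it records what it was asked to run.
executed = []


def fake_sql(statement):
    executed.append(statement)
    return f"result of: {statement}"


last = run_statements(fake_sql, "USE my_db; SELECT * FROM my_table\n")
print(executed)  # ['USE my_db', 'SELECT * FROM my_table']
```

In a notebook, the call would look like df = run_statements(spark.sql, statement), or simply two separate spark.sql(...) calls.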

Writing where query using pyspark on SQL table

I'm querying a SQL table using PySpark.
The table has two columns (value, isDelayed), where "value" is of type double and "isDelayed" is 0 or 1. How do I write a PySpark aggregation query that gives the sum of "value" where "isDelayed" is 1?
I've already tried the code below, which gives an error:
def __main__(self, data):
    delayedData = data.where(col('isDelayed').cast('int')==='1')
    groupByIsDelayed = delayedData.agg(sum(total))
    return groupByIsDelayed
I'm getting
"Syntax Error: invalid syntax"
on the line below:
delayedData = data.where(col('isDelayed').cast('int')==='1')
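Replace data.where(col('isDelayed').cast('int')==='1') with data.where(col('isDelayed').cast('int') == 1). Two things change:
Use == instead of ===, because the equality operator in Python is two equals signs.
Write 1 without quotes, because you are comparing an int, not a string.
Alternatively, pass the condition as a SQL expression string: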
data.where("isDelayed=1")
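The string form above is an ordinary SQL predicate. As a rough illustration of the filter-then-sum logic that runs without Spark, the sketch below executes the equivalent SQL against an in-memory SQLite table through Python's sqlite3 module. The table name delays and the sample rows are made up. In PySpark the same aggregation would typically be written as data.where(col('isDelayed') == 1).agg(sum('value')), with col and sum imported from pyspark.sql.functions.

```python
import sqlite3

# Made-up table mirroring the two columns described in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE delays (value REAL, isDelayed INTEGER)")
conn.executemany(
    "INSERT INTO delays VALUES (?, ?)",
    [(1.5, 1), (2.0, 0), (3.0, 1)],
)

# Keep only delayed rows, then sum their values.
(total_delayed,) = conn.execute(
    "SELECT SUM(value) FROM delays WHERE isDelayed = 1"
).fetchone()

print(total_delayed)  # 4.5
```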

How to use sql statement in django?

I want to get the latest date from my database.
Here is my sql statement.
select "RegDate"
from "Dev"
where "RegDate" = (
select max("RegDate") from "Dev")
It works in my database.
But how can I use it in django?
I tried the following code, but it returns an error. The code is in views.py.
Version 1:
lastest_date = Dev.objects.filter(reg_date=max(reg_date))
Error:
'NoneType' object is not iterable
Version 2:
last_activation_date = Dev.objects.filter(regdate='reg_date').order_by('-regdate')[0]
Error:
"'reg_date' value has an invalid format. It must be in YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] format."
I've defined reg_date at the beginning of the class.
What should I do?
You are making this too complicated. You can simply order by regdate:
last_activation_dev = Dev.objects.order_by('-regdate').first()
The .first() call returns the Dev object with the latest regdate, or None if there are no Dev objects.
If you are only interested in the regdate column itself, you can use .values_list(..):
last_activation_date = Dev.objects.order_by('-regdate').values_list('regdate', flat=True).first()
By using .filter(regdate='reg_date') you were filtering the Dev table for records whose regdate column equals the string 'reg_date'. Since 'reg_date' is not a valid datetime value, this produced the error.

Unrelated Column reference with filter syntax error

I'm using SSAS Tabular and trying to add a column that gets data (OrgNumber) from an unrelated table called DimCustomer.
DAX-Syntax:
=Calculate(Values('DimCustomer'[OrgNum]),FILTER('DimCustomer','DimCustomer'[CustomerNr]='FactTransactions'[CustomerNr])))
It throws this error message:
The syntax for 'FILTER' is incorrect.
The calculated column 'FactTransactions[CalculatedColumn1]' contains a syntax error. Provide a valid formula.
Try this:
=LOOKUPVALUE('DimCustomer'[OrgNum], 'DimCustomer'[CustomerNr], 'FactTransactions'[CustomerNr])
This assumes it is a calculated column on FactTransactions.
I laid out your code as shown below, and it seems you have an extra closing bracket:
=Calculate
(
    Values('DimCustomer'[OrgNum]),
    FILTER
    (
        'DimCustomer',
        'DimCustomer'[CustomerNr]='FactTransactions'[CustomerNr]
    )
)
)
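The same bracket count can be done mechanically. The sketch below is a small Python helper (the name first_unmatched_close is hypothetical) that walks the formula and reports the first closing parenthesis with no matching opener. It is a plain character scan, not a DAX parser, so it would be misled by parentheses inside quoted names.

```python
def first_unmatched_close(formula):
    """Return the index of the first ')' with no matching '(', or None."""
    depth = 0
    for index, char in enumerate(formula):
        if char == "(":
            depth += 1
        elif char == ")":
            if depth == 0:
                return index
            depth -= 1
    return None


# The formula from the question, split across two literals for readability.
formula = (
    "=Calculate(Values('DimCustomer'[OrgNum]),"
    "FILTER('DimCustomer','DimCustomer'[CustomerNr]='FactTransactions'[CustomerNr])))"
)

extra = first_unmatched_close(formula)
print(extra == len(formula) - 1)  # True: the final ')' is the extra one
print(formula[:extra])            # the formula with the extra bracket dropped
```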

Multiple parameter values

I have a problem with BIRT when I try to pass multiple values from a report parameter.
I'm using BIRT 2.6.2 and Eclipse.
I'm trying to pass multiple values from "JDSuser", the last parameter of a cascading parameter group. The parameter allows multiple values and I'm using a list box.
To do that, I write my SQL query with a WHERE ... IN clause and replace the placeholder text with JavaScript. Otherwise the BIRT SQL query can't receive multiple values from a report parameter.
My sql query is
select jamacomment.createdDate, jamacomment.scopeId,
jamacomment.commentText, jamacomment.documentId,
jamacomment.highlightQuote, jamacomment.organizationId,
jamacomment.userId,
organization.id, organization.name,
userbase.id, userbase.firstName, userbase.lastName,
userbase.organization, userbase.userName,
document.id, document.name, document.description,
user_role.userId, user_role.roleId,
role.id, role.name
from jamacomment jamacomment left join
userbase on userbase.id=jamacomment.userId
left join organization on
organization.id=jamacomment.organizationId
left join document on
document.id=jamacomment.documentId
left join user_role on
user_role.userId=userbase.id
right join role on
role.id=user_role.roleId
where jamacomment.scopeId=11
and role.name in ( 'sample grupa' )
and userbase.userName in ( 'sample' )
And my JavaScript code for that data set, in the beforeOpen event, is:
if( params["JDSuser"].value[0] != "(All Users)" ){
    this.queryText=this.queryText.replaceAll('sample grupa', params["JDSgroup"]);
    var users = params["JDSuser"];
    //var userquery = "'";
    var userquery = userquery + users.join("', '");
    //userquery = userquery + "'";
    this.queryText=this.queryText.replaceAll('sample', userquery);
}
I tried many different quote variations. With this one I get no error messages, but if I choose one value I get no data from the database, and if I choose at least two values I get data only for the last chosen value.
If I uncomment one of those additional quote lines, I get a syntax error like this:
The following items have errors:
Table (id = 597):
+ An exception occurred during processing. Please see the following message for details: Failed to prepare the query execution for the
data set: Organization Cannot get the result set metadata.
org.eclipse.birt.report.data.oda.jdbc.JDBCException: SQL statement does not return a ResultSet object. SQL error #1:You have an error in
your SQL syntax; check the manual that corresponds to your MySQL
server version for the right syntax to use near 'rudolfs.sviklis',
'sample' )' at line 25 ;
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to
your MySQL server version for the right syntax to use near
'rudolfs.sviklis', 'sample' )' at line 25
Also, I should mention that I'm doing this by following a working example. Everything is the same; my previous code resulted in the same syntax error, so I changed it to this script, which does the same thing.
The example is available here:
http://developer.actuate.com/community/forum/index.php?/files/file/593-default-value-all-with-multi-select-parsmeter/
If someone could give me at least a clue to what I should do that would be great.
You should always use the value property of a parameter, i.e.:
var users = params["JDSuser"].value;
It is not necessary to surround "userquery" with quotes, because these quotes are already in the SQL query around 'sample'. Furthermore, there is a mistake: userquery is not yet defined on this line:
var userquery = userquery + users.join("', '");
This might introduce a string such as "null" into your query. Therefore, remove all references to the userquery variable and just use this expression at the end:
this.queryText=this.queryText.replaceAll('sample', users.join("','"));
Notice that I removed the blank space in the join expression. Finally, once it works, you probably need to make your report more robust by testing whether the value is null:
if( params["JDSuser"].value!=null && params["JDSuser"].value[0] != "(All Users)" ){
    //Do stuff...
}
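To see why no extra quotes are needed, it may help to trace the substitution by hand. The sketch below mimics the replace-and-join step in Python rather than BIRT's JavaScript. The user names are made up, and only the last line of the query is shown. Because the placeholder already sits inside single quotes, joining the values with ',' closes and reopens the quotes between items.

```python
# Last line of the query from the question; 'sample' is the placeholder.
query_tail = "and userbase.userName in ( 'sample' )"

# Hypothetical selections from the multi-value parameter.
several_users = ["alice", "bob", "carol"]
one_user = ["alice"]

# Join with quote-comma-quote; the outer quotes come from the query itself.
filled_many = query_tail.replace("sample", "','".join(several_users))
filled_one = query_tail.replace("sample", "','".join(one_user))

print(filled_many)  # and userbase.userName in ( 'alice','bob','carol' )
print(filled_one)   # and userbase.userName in ( 'alice' )
```

Note that this kind of text substitution breaks if a value itself contains a single quote, so it is only suitable for values from a controlled list.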