Can anyone help me prepare data for the new Geo Maps feature? I want to show the data below on geo maps.
Country Name Sales
Russia 1244
Canada 3553
Germany 5345
Australia 2456
France 2566
United Kingdom 6743
India 3677
United States 5633
Thanks in advance,
The setup is quite easy, and you can find more information here:
https://developer.gooddata.com/article/setting-up-data-for-geo-charts
Basically, it is about setting up the correct data type for the columns that represent geo-information.
JT
I have a table based on Covid-19 data from the Department of Health (UK).
So far so nice. But I want to create an extra column for the percentage change. I am still learning. I'll add the data, although it has been modified; there is a separate question about how I did that in Pandas: "How to transform data in Pandas with categoricals and timestamps for use in Tableau".
Update
I know I have to switch the year data to create the column.
   Type 1 Acute?  NHS England Region   Code                             Name        Date  Value
0             No     East of England  8D379  CARETECH COMMUNITY SERVICES LTD  2020-08-01    0.0
1             No     East of England  8D379  CARETECH COMMUNITY SERVICES LTD  2020-08-02    0.0
2             No     East of England  8D379  CARETECH COMMUNITY SERVICES LTD  2020-08-03    0.0
3             No     East of England  8D379  CARETECH COMMUNITY SERVICES LTD  2020-08-04    0.0
4             No     East of England  8D379  CARETECH COMMUNITY SERVICES LTD  2020-08-05    0.0
I tried this combination. Would be too confusing for most people...
Any analysis in Tableau can be expressed in terms of percentages. For example, rather than viewing sales for every product, you might want to view each product’s sales as a percentage of the total sales for all products.
Read about % change
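The arithmetic behind both ideas (percent of total and percent change) is small enough to sketch in Python. This is only an illustration of the calculation, not how Tableau implements it; the function names and the sample values are made up, and treating a change from 0 as undefined is an assumption you may want to handle differently.

```python
from typing import List, Optional


def percent_of_total(values: List[float]) -> List[float]:
    """Return each value as a percentage of the sum of all values."""
    total = sum(values)
    if total == 0:
        # Avoid dividing by zero when every value is 0.
        return [0.0 for _ in values]
    return [100.0 * v / total for v in values]


def percent_change(values: List[float]) -> List[Optional[float]]:
    """Return the percentage change from the previous value.

    The first entry has no predecessor and a change from 0 is undefined,
    so both cases yield None (an assumption made for this sketch).
    """
    if not values:
        return []
    changes: List[Optional[float]] = [None]
    for prev, cur in zip(values, values[1:]):
        changes.append(None if prev == 0 else 100.0 * (cur - prev) / prev)
    return changes


# Hypothetical daily values for one organisation (not taken from the real data).
daily_values = [4.0, 5.0, 5.0, 2.0]
print(percent_of_total(daily_values))
print(percent_change(daily_values))
```

With the real table you would apply the change calculation per organisation, after sorting each one's rows by date.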
We have a product usage table for software. It has 4 fields: [product name], [usage month], [users] and [Country]. We must report the data by Country and Product Name for licensing purposes. Our rule is to report the second highest number of users per country for each product. The same products can be used in all countries. It is based on monthly usage numbers, so we need the second peak usage for FY 2020. Since all of the data is in one table, I am having trouble figuring out the SQL to get the information I need from the table.
I am thinking I need to do multiple selects (inner selects?) and group the data in a way to pull out the product name, peak usage and country. But that is where I am getting confused as to the best approach.
Example Data looks like this:
[product name], [usage month], [users], [Country]
Product1 January 831 United States of America
Product1 December 802 United States of America
Product1 September 687 United States of America
Product1 August 407 United States of America
Product1 July 799 United States of America
Product1 June 824 United States of America
Product1 April 802 United States of America
Product1 May 796 United States of America
Product1 February 847 United States of America
Product1 March 840 United States of America
Product1 November 818 United States of America
Product1 October 841 United States of America
Product2 March 1006 United States of America
Product2 February 1076 United States of America
Product2 April 890 United States of America
Product2 May 831 United States of America
Product2 September 538 United States of America
Product2 October 1053 United States of America
Product2 July 673 United States of America
Product2 August 87 United States of America
Product2 November 994 United States of America
Product2 January 1042 United States of America
Product2 December 952 United States of America
Product2 June 873 United States of America
I had originally thought about breaking this out into multiple tables and then trying SQL against each product table, but since this is something I will need to do monthly, I didn't want to redesign the ETL that loads the data because 1) I don't control that ETL and 2) I felt like that would be a move backwards for a repetitive task. We were also looking into Power BI to do this for us, but haven't found the right approach, and I would honestly rather have this in SQL.
If I follow you correctly:
select *
from (
    select t.*,
           row_number() over(partition by product_name, country order by users desc) rn
    from mytable t
) t
where rn = 2
This generates one row per product and country, that corresponds to the second highest number of users.
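To check the query against the sample data from the question, here is a small sketch that runs it in an in-memory SQLite database from Python. It assumes the columns are named product_name, usage_month, users and country (the question uses bracketed names with spaces) and an SQLite build with window functions (3.25 or newer); other databases may need small syntax changes.

```python
import sqlite3

US = "United States of America"
# Sample rows from the question: (product_name, usage_month, users, country).
rows = [
    ("Product1", "January", 831, US), ("Product1", "December", 802, US),
    ("Product1", "September", 687, US), ("Product1", "August", 407, US),
    ("Product1", "July", 799, US), ("Product1", "June", 824, US),
    ("Product1", "April", 802, US), ("Product1", "May", 796, US),
    ("Product1", "February", 847, US), ("Product1", "March", 840, US),
    ("Product1", "November", 818, US), ("Product1", "October", 841, US),
    ("Product2", "March", 1006, US), ("Product2", "February", 1076, US),
    ("Product2", "April", 890, US), ("Product2", "May", 831, US),
    ("Product2", "September", 538, US), ("Product2", "October", 1053, US),
    ("Product2", "July", 673, US), ("Product2", "August", 87, US),
    ("Product2", "November", 994, US), ("Product2", "January", 1042, US),
    ("Product2", "December", 952, US), ("Product2", "June", 873, US),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(product_name TEXT, usage_month TEXT, users INTEGER, country TEXT)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

# Rank the months within each product/country by users, keep rank 2.
query = """
SELECT product_name, country, usage_month, users
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY product_name, country
               ORDER BY users DESC
           ) AS rn
    FROM mytable AS t
) AS ranked
WHERE rn = 2
ORDER BY product_name, country
"""
second_peaks = conn.execute(query).fetchall()
for row in second_peaks:
    print(row)
```

For this data the second peak is October for both products (841 and 1053 users). Note that row_number() breaks ties arbitrarily; if two months could share the top value and you want the second highest distinct number, dense_rank() is the usual substitute.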
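For one country it should be fairly simple. This is off the top of my head, but a bit of tweaking should do it. The table and column names are guesses, so they are likely off, and #Country stands in for the country you are filtering on.
-- Skip the highest value and return the next one.
-- LIMIT/OFFSET syntax; SQL Server uses OFFSET ... FETCH instead.
SELECT users
FROM ProductCounts
WHERE Country = #Country
ORDER BY users DESC
LIMIT 1 OFFSET 1;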
I don't really get a sense of how your data is entered to get a good feel of a better way to store the data to get the information you desire for your report.
You can use this; it returns the second highest user count for each country and product. Note that when there is only one user count for a country and product, that pair will not show up; there have to be at least two user counts per country and product.
SELECT
    country, product, users
FROM
    ProductCounts
WHERE
    (SELECT COUNT(*) FROM ProductCounts AS p
     WHERE
         p.country = ProductCounts.country
         AND
         p.product = ProductCounts.product
         AND
         p.users >= ProductCounts.users) = 2
GROUP BY
    country, product
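One thing to watch with the counting approach is ties: if two months share the highest value, each of them has exactly two rows greater than or equal to it, so the top value is returned instead of the second. A possible variation, sketched below with SQLite from Python, counts the distinct values strictly above each row. The table contents are hypothetical and this is a variation on the query above, not the same query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductCounts (country TEXT, product TEXT, users INTEGER)"
)
# Hypothetical rows: Product1 has a tie for the top value.
conn.executemany(
    "INSERT INTO ProductCounts VALUES (?, ?, ?)",
    [
        ("USA", "Product1", 847),
        ("USA", "Product1", 847),
        ("USA", "Product1", 841),
        ("USA", "Product1", 840),
        ("USA", "Product2", 1076),
        ("USA", "Product2", 1053),
        ("USA", "Product3", 500),  # only one count, so no second highest
    ],
)

# A row holds the second highest value when exactly one distinct
# value in the same country/product group is strictly larger.
query = """
SELECT DISTINCT t.country, t.product, t.users
FROM ProductCounts AS t
WHERE (
    SELECT COUNT(DISTINCT p.users)
    FROM ProductCounts AS p
    WHERE p.country = t.country
      AND p.product = t.product
      AND p.users > t.users
) = 1
ORDER BY t.country, t.product
"""
second_highest = conn.execute(query).fetchall()
for row in second_highest:
    print(row)
```

Product3 is absent from the result because it has no second value, which matches the caveat above.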
How can I find whether a city in the US belongs to the east or west coast by using its latitude and longitude?
I have the lat and long for the city, but I could not work out whether it belongs to the east or the west.
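Here's a map showing latitude and longitude lines. East versus west depends on the longitude, so decide where you want to cut over.
https://www.worldatlas.com/webimage/countrys/usalats.htm
Maybe a longitude west of 115° W could be west coast, and east of 95° W east coast?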
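The cutover idea can be sketched in Python. This assumes longitudes in signed decimal degrees (negative values are west of Greenwich), so the 115 and 95 suggested above become -115 and -95. The function name and the "neither" label are made up for illustration, and the cutoffs are only the rough guess from this answer.

```python
def classify_coast(
    longitude: float,
    west_cutoff: float = -115.0,
    east_cutoff: float = -95.0,
) -> str:
    """Classify a US location as 'west', 'east' or 'neither' by longitude.

    longitude is in signed decimal degrees, negative west of Greenwich.
    The cutoffs are rough, adjustable guesses, not official boundaries.
    """
    if longitude <= west_cutoff:
        return "west"
    if longitude >= east_cutoff:
        return "east"
    return "neither"


# Approximate city longitudes, for illustration only.
print(classify_coast(-118.24))  # Los Angeles
print(classify_coast(-74.01))   # New York
print(classify_coast(-104.99))  # Denver
```

With these defaults Los Angeles comes out as west, New York as east and Denver as neither. The latitude is not needed for this split; move the cutoffs if you want a narrower definition of "coast".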
In a very simple workbook I load data into an SFrame named "Students".
When I execute "Students" I get the expected results (just cut and pasted, not the actual results):
First Name Last Name Country age
Bob Smith United States 24
Alice Williams Canada 23
Malcolm Jone England 22
Felix Brown USA 23
Alex Cooper Poland 23
Tod Campbell United States 22
Derek Ward Switzerland 25
[7 rows x 4 columns]
But when I enter "Students.explore()" the only result I get is the message
"Materializing SFrame"
I expected a GUI with a rich display describing the data. This is what I get when I use graphlab.create in a workbook outside Google Colaboratory.
Below is the method description and a link to the turicreate API help.
"SFrame.explore([title]): Explore the SFrame in an interactive GUI."
https://apple.github.io/turicreate/docs/api/turicreate.visualization.html
Google Colab runs in the cloud, so it can't open a new app window on your computer.
You may want to try a local runtime.
I read the book "Sams Teach Yourself SQL in 10 Minutes, Third Edition", and in Lesson 10, "Grouping Data", section "Creating Groups", I can't understand the following:
"Aside from the aggregate calculations statements, every column in your SELECT statement must be present in the GROUP BY clause."
Why? I tried this and I think that it is not true.
For example, consider a table 'World' with the columns 'continent', 'country', 'population'.
SELECT continent, country
FROM World
GROUP BY continent;
According to the book, this should lead to an error, right? But it doesn't. I can group my data by continent (so the result has 7 continents), and next to each continent I get a random country name.
Like this
continent country
North America Canada
South America Brazil
Europe France
Africa Cameroon
Asia Japan
Australia New Zealand
Antarctica TuxLand
You are most probably using MySQL, which allows ungrouped and unaggregated expressions in the SELECT clause.
This is a violation of the standard, of course.
This is intended to simplify GROUP BY with joins on a PRIMARY KEY:
SELECT a.*, SUM(b.value)
FROM a
JOIN b
ON b.a_id = a.id
GROUP BY
a.id
Normally, you would have to either add all columns from a to the GROUP BY clause or use a subquery.
MySQL allows you not to do it since all values from a are guaranteed to be the same for a given value of the PRIMARY KEY (which is grouped on).
This is correct: some SQL dialects, such as MySQL, accept the query without an error. You may optionally GROUP BY more than one column, but those dialects don't require it.
For the ungrouped column, MySQL returns a value from one row of each group. It is often the first row it encounters, but which row is not guaranteed. So in your case, it would return one country per continent.
MySQL allows this with a single field in the GROUP BY (unless the ONLY_FULL_GROUP_BY mode is enabled); PostgreSQL allows it only when the grouped column is the table's primary key.
The tutorial probably assumes you should GROUP BY every field you select so that you don't lose any data. In the above example that would show every country/continent pair, but only once.
Here's an example table:
Continent     | Country     | Random_Field
------------------------------------------
North America | Canada      | Cake
North America | Canada      | Dog
South America | Brazil      | Cat
Europe        | France      | Frog
Africa        | Cameroon    | House
Asia          | Japan       | Gadget
Asia          | India       | Dance
Australia     | New Zealand | Frodo
Antarctica    | TuxLand     | Linux
In your first statement:
SELECT continent, country
FROM World
GROUP BY continent;
The output would be:
Continent     | Country
---------------------------
North America | Canada
South America | Brazil
Europe        | France
Africa        | Cameroon
Asia          | Japan
Australia     | New Zealand
Antarctica    | TuxLand
Notice one of the Asia rows was lost, despite being different.
Using a GROUP BY on both:
SELECT continent, country
FROM World
GROUP BY continent, country;
Would yield:
Continent     | Country
---------------------------
North America | Canada
South America | Brazil
Europe        | France
Africa        | Cameroon
Asia          | Japan
Asia          | India
Australia     | New Zealand
Antarctica    | TuxLand
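If you want one row per continent without relying on the database to pick a country for you, an aggregate makes the choice explicit and is accepted by any standard-compliant database. The sketch below runs both variants against the example table in an in-memory SQLite database from Python. MIN(country) is just one possible rule (alphabetically first), chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE World (continent TEXT, country TEXT, random_field TEXT)"
)
conn.executemany(
    "INSERT INTO World VALUES (?, ?, ?)",
    [
        ("North America", "Canada", "Cake"),
        ("North America", "Canada", "Dog"),
        ("South America", "Brazil", "Cat"),
        ("Europe", "France", "Frog"),
        ("Africa", "Cameroon", "House"),
        ("Asia", "Japan", "Gadget"),
        ("Asia", "India", "Dance"),
        ("Australia", "New Zealand", "Frodo"),
        ("Antarctica", "TuxLand", "Linux"),
    ],
)

# Every selected column appears in GROUP BY: one row per distinct pair.
pairs = conn.execute(
    "SELECT continent, country FROM World "
    "GROUP BY continent, country ORDER BY continent, country"
).fetchall()

# One row per continent; the aggregate states which country to keep.
one_per_continent = conn.execute(
    "SELECT continent, MIN(country) AS country FROM World "
    "GROUP BY continent ORDER BY continent"
).fetchall()

print(pairs)
print(one_per_continent)
```

The first query keeps both Asia rows and collapses the duplicate Canada row; the second returns seven rows, with India chosen for Asia because it sorts before Japan.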