I have CloudWatch Logs (CWL) entries as shown below. I am showing them in SQL table form for clarity.
Name City
1 Chicago
2 Wuhan
3 Chicago
4 Wuhan
5 Los Angeles
Now I want to get below output
City Count
Chicago 2
Wuhan 2
Los Angeles 1
Is there a way I can run a GROUP BY in CWL Insights?
Pseudo Query
Select Count(*), City From {TableName} GROUP BY City
You can use the aggregation function count with the by statement: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html
Here is a full example for your case, assuming the logs contain the entries exactly as you have in the example (regex for city name is very simple, you may want to refine that).
fields @timestamp, @message
| parse @message /^(?<number>\d+)\s+(?<city>[a-zA-Z\s]+)$/
| filter ispresent(city)
| stats count(*) by city
Result:
---------------------------
| city | count(*) |
|--------------|----------|
| Chicago | 2 |
| Wuhan | 2 |
| Los Angeles | 1 |
---------------------------
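To check the regex and the grouping logic outside CloudWatch, here is a minimal Python sketch that mimics the parse, filter and stats steps on the sample entries from the question. It is only an offline illustration and assumes each log message is exactly a number followed by a city name; it does not call any AWS API.

```python
import re
from collections import Counter

# Sample log messages copied from the question.
messages = ["1 Chicago", "2 Wuhan", "3 Chicago", "4 Wuhan", "5 Los Angeles"]

# Same pattern as the parse step: a number, whitespace, then the city name.
pattern = re.compile(r"^(?P<number>\d+)\s+(?P<city>[a-zA-Z\s]+)$")

counts = Counter()
for message in messages:
    match = pattern.match(message)
    if match:  # mirrors `filter ispresent(city)`
        counts[match.group("city")] += 1  # mirrors `stats count(*) by city`

for city, count in counts.items():
    print(city, count)
```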
Related
I currently have a dataset that looks like this:
Personid | Question | Response
1 | Name | Daniel
1 | Gender | Male
1 | Address | New York, NY
2 | Name | Susan
2 | Gender | Female
2 | Address | Boston, MA
3 | Name | Leonard
3 | Gender | Male
3 | Address | New York, NY
I also have another table that looks like this (just the person id):
Personid
1
1
1
2
2
2
3
3
3
I want to write a query to return something like this:
Personid | Name | Gender | Address
1 |Daniel | Male | New York, NY
2 | Susan | Female | Boston, MA
3 |Leonard | Male | New York, NY
I think it's a mix of some sort of "transpose" (not sure if it's even available in SQL) and conditional statement on just the gender, but I'm having issues with getting the end result. Could anyone offer any advice?
Easiest way is just to link to the question table three times with different aliases.
select
p.personid,
n.response as name,
g.response as gender,
a.response as address
from
person p
join question n
on n.personid = p.personid and n.question = 'Name'
join question g
on g.personid = p.personid and g.question = 'Gender'
join question a
on a.personid = p.personid and a.question = 'Address'
I'm assuming that your person table only has 3 rows, not the 9 you've listed. If there really are 9, then just do a SELECT DISTINCT.
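As a quick way to try the join approach, here is a small sketch that loads the sample data into an in-memory SQLite database from Python. The table names person and question follow the query above; the person table is deliberately given the 9 rows from the question, so DISTINCT is needed. SQLite is only used here for convenience, and the same SQL should behave the same way in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question (personid INTEGER, question TEXT, response TEXT);
INSERT INTO question VALUES
  (1, 'Name', 'Daniel'),  (1, 'Gender', 'Male'),   (1, 'Address', 'New York, NY'),
  (2, 'Name', 'Susan'),   (2, 'Gender', 'Female'), (2, 'Address', 'Boston, MA'),
  (3, 'Name', 'Leonard'), (3, 'Gender', 'Male'),   (3, 'Address', 'New York, NY');
CREATE TABLE person (personid INTEGER);
INSERT INTO person VALUES (1), (1), (1), (2), (2), (2), (3), (3), (3);
""")

# One join per question type; DISTINCT collapses the repeated person rows.
rows = conn.execute("""
SELECT DISTINCT p.personid,
       n.response AS name,
       g.response AS gender,
       a.response AS address
FROM person p
JOIN question n ON n.personid = p.personid AND n.question = 'Name'
JOIN question g ON g.personid = p.personid AND g.question = 'Gender'
JOIN question a ON a.personid = p.personid AND a.question = 'Address'
ORDER BY p.personid
""").fetchall()

for row in rows:
    print(row)
```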
This is a textbook example of a pivot table. In PostgreSQL it is implemented by the crosstab function, which is available from the additional extension module tablefunc.
If your need is really as simple as the provided MCVE, multiple JOINs might be enough, but in more complicated situations crosstab is really the way to go, and worth the pain of installing an additional module if it is not installed by default by your distro. In short, if your initial table is called dataset, and personid is an INT:
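-- To execute as superuser. Be sure you have installed the extension
-- package. Execute once to install; it will stay in your database
-- from then on.
CREATE EXTENSION tablefunc;
-- As normal user. The second argument lists the categories in output-column
-- order, so the result does not depend on the row order of the source query.
SELECT * FROM crosstab($$
SELECT personid, question, response FROM dataset ORDER BY 1
$$, $$
VALUES ('Name'), ('Gender'), ('Address')
$$) AS ct(person INT, name TEXT, gender TEXT, address TEXT);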
person | name | gender | address
--------+----------+---------+---------------
1 | Daniel | Male | New York, NY
2 | Susan | Female | Boston, MA
3 | Leonard | Male | New York, NY
(3 rows)
You can add WHERE clauses, JOIN with other tables, etc., according to your needs.
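If installing an extension is not an option, a common portable alternative is conditional aggregation: one MAX(CASE ...) per output column, grouped by person. This is a different technique from crosstab, shown here as a minimal sketch against an in-memory SQLite database; the table name dataset is taken from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (personid INTEGER, question TEXT, response TEXT);
INSERT INTO dataset VALUES
  (1, 'Name', 'Daniel'),  (1, 'Gender', 'Male'),   (1, 'Address', 'New York, NY'),
  (2, 'Name', 'Susan'),   (2, 'Gender', 'Female'), (2, 'Address', 'Boston, MA'),
  (3, 'Name', 'Leonard'), (3, 'Gender', 'Male'),   (3, 'Address', 'New York, NY');
""")

# Each CASE yields the response for one question and NULL otherwise;
# MAX ignores the NULLs, leaving one value per person and column.
rows = conn.execute("""
SELECT personid,
       MAX(CASE WHEN question = 'Name'    THEN response END) AS name,
       MAX(CASE WHEN question = 'Gender'  THEN response END) AS gender,
       MAX(CASE WHEN question = 'Address' THEN response END) AS address
FROM dataset
GROUP BY personid
ORDER BY personid
""").fetchall()

for row in rows:
    print(row)
```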
I have a table with data stored vertically. A simplified example is shown below; it has one record for each city a customer has lived in:
| CUSTOMER | CITY |
------------------------------
| John | London |
| John | Manchester |
| Sarah | Cardiff |
| Sarah | Edinburgh |
| Sarah | Liverpool |
| Craig | Manchester |
| Craig | London |
I am trying to come up with an SQL query that will return all unique combinations of cities. In the example above, John and Craig have both lived in London and Manchester, but Sarah has lived in different cities (Cardiff, Edinburgh and Liverpool), so I would like the output below (the query should handle any number of cities):
| CITY1 | CITY2 | CITY3 |
--------------------------------------------
| London | Manchester | |
| Cardiff | Edinburgh | Liverpool |
I have tried using a crosstab query to view the data horizontally like this:
TRANSFORM Max(City)
SELECT Customer
FROM tblCities
GROUP BY Customer
PIVOT City
but it is just returning a field for all cities for every customer. Does anyone know if this is possible using SQL?
P.S. Ideally it will ignore the order of the cities.
This was a nice challenge! The query below gets the groupings per customer. It doesn't discard the duplicates where multiple customers have lived in the same combination of cities ... I'll let you or others find a way to handle that.
TRANSFORM Min(OrderedList.City) AS MinOfCity
SELECT OrderedList.Customer
FROM (SELECT CustomerCities.Customer, CustomerCities.City, Count(1) AS CityNo
FROM CustomerCities INNER JOIN CustomerCities AS CustomerCities_1 ON CustomerCities.Customer = CustomerCities_1.Customer
WHERE (((CustomerCities.City)>=[CustomerCities_1].[City]))
GROUP BY CustomerCities.Customer, CustomerCities.City) OrderedList
GROUP BY OrderedList.Customer
PIVOT "CITY" & [CityNo];
Is this what you want?
select distinct c1.city, c2.city
from tblCities as c1 inner join
tblCities as c2
on c1.customer = c2.customer and c1.city < c2.city;
This returns all pairs of cities that appear for any single customer.
Here is a query which might work assuming each customer is only associated with two cities:
SELECT DISTINCT t.city_1, t.city_2
FROM
(
SELECT MIN(CITY) AS city_1, MAX(CITY) AS city_2
FROM tblCities
GROUP BY CUSTOMER
) t
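If post-processing outside the database is acceptable, the deduplication for any number of cities can be done in a few lines of application code. The sketch below is in Python and assumes the (customer, city) rows have already been fetched; sorting each customer's cities makes the combination independent of order, and collecting the results in a set drops combinations shared by several customers.

```python
from collections import defaultdict

# (customer, city) rows copied from the question.
rows = [
    ("John", "London"), ("John", "Manchester"),
    ("Sarah", "Cardiff"), ("Sarah", "Edinburgh"), ("Sarah", "Liverpool"),
    ("Craig", "Manchester"), ("Craig", "London"),
]

# Collect the set of cities each customer has lived in.
cities_by_customer = defaultdict(set)
for customer, city in rows:
    cities_by_customer[customer].add(city)

# A sorted tuple is an order-independent key for a combination;
# the set comprehension keeps each combination only once.
combinations = sorted(
    {tuple(sorted(cities)) for cities in cities_by_customer.values()}
)

for combo in combinations:
    print(" | ".join(combo))
```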
I have two tables in PostgreSQL and I'm trying to get the number of times a hashtag is repeated by place.
I've made this query:
SELECT tweets_with_location.user_location,
tweets_with_location.my_new_id,
all_hashtags_with_location.regexp_split_to_table
FROM tweets_with_location, all_hashtags_with_location
WHERE tweets_with_location.my_new_id = all_hashtags_with_location.my_new_id;
Which returns the Location, the tweet id and the hashtag:
USER_LOCATION | MY_NEW_ID | HASHTAG
New York, NY | 33 | Happy
New York, NY | 40 | BigApple
Bronx, NY | 12 | Happy
Bronx, NY | 45 | Happy
Queens, NY | 23 | Trump
Queens, NY | 20 | Trump
Then I wrote another SQL query, but it doesn't sum up the number of times a hashtag appears by place; the count value is always 1:
SELECT tweets_with_location.user_location,
all_hashtags_with_location.regexp_split_to_table,
COUNT(DISTINCT all_hashtags_with_location.regexp_split_to_table) AS CountOf
FROM tweets_with_location, all_hashtags_with_location
WHERE tweets_with_location.my_new_id = all_hashtags_with_location.my_new_id
GROUP BY tweets_with_location.user_location,
all_hashtags_with_location.regexp_split_to_table
ORDER BY CountOf DESC;
What I need is this result:
USER_LOCATION | HASHTAG | COUNT
New York, NY | Happy | 1
Bronx, NY | Happy | 2
Queens, NY | Trump | 2
New York, NY | BigApple | 1
How do I do this? What is wrong with my SQL Query?
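Just remove the DISTINCT qualifier in the COUNT() function. Because the hashtag is also a GROUP BY column, every group contains exactly one distinct hashtag, so COUNT(DISTINCT hashtag) is always 1.
You were really close; you are just counting the wrong field: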
SELECT tweets_with_location.user_location,
all_hashtags_with_location.regexp_split_to_table,
COUNT(DISTINCT tweets_with_location.my_new_id) AS CountOf
FROM tweets_with_location, all_hashtags_with_location
WHERE tweets_with_location.my_new_id = all_hashtags_with_location.my_new_id
GROUP BY tweets_with_location.user_location,
all_hashtags_with_location.regexp_split_to_table
ORDER BY CountOf DESC;
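To see the difference between the two counts side by side, here is a minimal sketch that rebuilds the sample data in an in-memory SQLite database from Python. The table and column names follow the question; the aliases t and h and the use of SQLite instead of PostgreSQL are assumptions made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweets_with_location (my_new_id INTEGER, user_location TEXT);
INSERT INTO tweets_with_location VALUES
  (33, 'New York, NY'), (40, 'New York, NY'),
  (12, 'Bronx, NY'),    (45, 'Bronx, NY'),
  (23, 'Queens, NY'),   (20, 'Queens, NY');
CREATE TABLE all_hashtags_with_location (my_new_id INTEGER, regexp_split_to_table TEXT);
INSERT INTO all_hashtags_with_location VALUES
  (33, 'Happy'), (40, 'BigApple'), (12, 'Happy'),
  (45, 'Happy'), (23, 'Trump'),    (20, 'Trump');
""")

QUERY = """
SELECT t.user_location, h.regexp_split_to_table, {aggregate} AS CountOf
FROM tweets_with_location t
JOIN all_hashtags_with_location h ON t.my_new_id = h.my_new_id
GROUP BY t.user_location, h.regexp_split_to_table
ORDER BY t.user_location, h.regexp_split_to_table
"""

# Counting distinct values of a grouping column always yields 1 per group.
always_one = conn.execute(
    QUERY.format(aggregate="COUNT(DISTINCT h.regexp_split_to_table)")
).fetchall()

# Counting distinct tweet ids gives the number of tweets per group.
per_group = conn.execute(
    QUERY.format(aggregate="COUNT(DISTINCT t.my_new_id)")
).fetchall()

print(always_one)
print(per_group)
```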
Is there a more efficient way of querying a table (or collection of tables) for all possible combinations of a few columns? I'm currently running GROUP BY and then MAX, but this doesn't seem to be the most efficient way.
SQL Fiddle for the below example: http://sqlfiddle.com/#!2/25f8b/3
Example Table
ID | Name | Age | City | Color
--------------------------------
1 | Dave | 10 | London | Red
2 | Dave | 11 | London | Purple
3 | Dave | 10 | Paris | Orange
4 | Jim | 10 | London | Red
5 | Jim | 10 | London | Green
6 | Jim | 11 | London | Lazer
etc... (around 500,000 rows)
Currently doing:
SELECT max(ID), Name, Age, City, Color
from People
group by Name, Age, City
To produce:
MAX(ID) NAME AGE CITY COLOR
1 Dave 10 London Red
3 Dave 10 Paris Orange
2 Dave 11 London Purple
5 Jim 10 London Red
6 Jim 11 London Lazer
Note that 4 is missing, as it's an exact duplicate of 5.
3 is included, as it has a different city from 1, even with the same age and name.
However, on this massive database it currently takes around ten minutes to return the results (note it's actually a join of a few tables).
Is there a more efficient way to return the same results? I was imagining a mass collection of SELECT * WHERE name = %, age = % and city = % LIMIT 1 or something similar
To get the different combinations there is a reserved word, DISTINCT:
SELECT DISTINCT Name, Age, City
FROM People
This gives the same result as:
SELECT Name, Age, City
FROM People
GROUP BY Name, Age, City
However, it is limited:
- If you add a column (like Color), it is included in the combinations analysis.
- You can't combine it with a per-combination aggregate such as MAX(ID); that requires GROUP BY.
- I don't know whether it is any better performance-wise.
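The equivalence of the two queries, and the extra thing GROUP BY can do, can be checked with a small sketch against an in-memory SQLite database in Python. It uses the six sample rows from the question and says nothing about performance on the real 500,000-row data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (ID INTEGER, Name TEXT, Age INTEGER, City TEXT, Color TEXT);
INSERT INTO People VALUES
  (1, 'Dave', 10, 'London', 'Red'),
  (2, 'Dave', 11, 'London', 'Purple'),
  (3, 'Dave', 10, 'Paris',  'Orange'),
  (4, 'Jim',  10, 'London', 'Red'),
  (5, 'Jim',  10, 'London', 'Green'),
  (6, 'Jim',  11, 'London', 'Lazer');
""")

distinct_rows = conn.execute("""
SELECT DISTINCT Name, Age, City FROM People
ORDER BY Name, Age, City
""").fetchall()

grouped_rows = conn.execute("""
SELECT Name, Age, City FROM People
GROUP BY Name, Age, City
ORDER BY Name, Age, City
""").fetchall()

# Only GROUP BY can also return an aggregate per combination.
max_id_rows = conn.execute("""
SELECT MAX(ID), Name, Age, City FROM People
GROUP BY Name, Age, City
ORDER BY Name, Age, City
""").fetchall()

print(distinct_rows == grouped_rows)
print(max_id_rows)
```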
I have three tables like this:
Person table:
person_id | name | dob
--------------------------------
1 | Naveed | 1988
2 | Ali | 1985
3 | Khan | 1987
4 | Rizwan | 1984
Address table:
address_id | street | city | state | country
----------------------------------------------------
1 | MAJ Road | Karachi | Sindh | Pakistan
2 | ABC Road | Multan | Punjab | Pakistan
3 | XYZ Road | Riyadh | SA | SA
Person_Address table:
person_id | address_id
----------------------
1 | 1
2 | 2
3 | 3
Now I want to get all records of Person_Address table but also with their person and address records like this by one query:
person_id| name | dob | address_id | street | city | state | country
----------------------------------------------------------------------------------
1 | Naveed | 1988 | 1 | MAJ Road | Karachi | Sindh | Pakistan
2 | Ali | 1985 | 2 | ABC Road | Multan | Punjab | Pakistan
3 | Khan | 1987 | 3 | XYZ Road | Riyadh | SA | SA
How is this possible using Zend? Thanks.
The reference guide is the best starting point to learn about Zend_Db_Select. Along with my example below, of course:
//$db is an instance of Zend_Db_Adapter_Abstract
$select = $db->select();
$select->from(array('p' => 'person'), array('person_id', 'name', 'dob'))
->join(array('pa' => 'Person_Address'), 'pa.person_id = p.person_id', array())
->join(array('a' => 'Address'), 'a.address_id = pa.address_id', array('address_id', 'street', 'city', 'state', 'country'));
It's then as simple as this to fetch a single row (use fetchAll instead of fetchRow to get every matching row):
$db->fetchRow($select);
When debugging Zend_Db_Select there's a handy trick you can use: simply print the select object, which in turn invokes its __toString method to produce the SQL:
echo $select; //prints SQL
I'm not sure if you're looking for SQL to do the above, or code using Zend's facilities. Given the presence of "sql" and "joins" in the tags, here's the SQL you'd need:
SELECT p.person_id, p.name, p.dob, a.address_id, street, city, state, country
FROM person p
INNER JOIN Person_Address pa ON pa.person_id = p.person_id
INNER JOIN Address a ON a.address_id = pa.address_id
Bear in mind that the Person_Address table tells us that there's a many-to-many relationship between a Person and an Address. Many Persons may share an Address, and a Person may have more than one Address.
The SQL above will show ALL such relationships. So if Naveed has two Address records, you will have two rows in the result set with person_id = 1.
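The row multiplication is easy to reproduce with a minimal sketch in Python against an in-memory SQLite database. The data comes from the question, except for one extra Person_Address link (person 1 to address 2), which is hypothetical and added only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER, name TEXT, dob INTEGER);
INSERT INTO person VALUES
  (1, 'Naveed', 1988), (2, 'Ali', 1985), (3, 'Khan', 1987), (4, 'Rizwan', 1984);
CREATE TABLE Address (address_id INTEGER, street TEXT, city TEXT, state TEXT, country TEXT);
INSERT INTO Address VALUES
  (1, 'MAJ Road', 'Karachi', 'Sindh',  'Pakistan'),
  (2, 'ABC Road', 'Multan',  'Punjab', 'Pakistan'),
  (3, 'XYZ Road', 'Riyadh',  'SA',     'SA');
CREATE TABLE Person_Address (person_id INTEGER, address_id INTEGER);
-- The (1, 2) link is hypothetical: it gives Naveed a second address.
INSERT INTO Person_Address VALUES (1, 1), (2, 2), (3, 3), (1, 2);
""")

rows = conn.execute("""
SELECT p.person_id, p.name, p.dob,
       a.address_id, a.street, a.city, a.state, a.country
FROM person p
INNER JOIN Person_Address pa ON pa.person_id = p.person_id
INNER JOIN Address a ON a.address_id = pa.address_id
ORDER BY p.person_id, a.address_id
""").fetchall()

# One result row per link, so person_id = 1 now appears twice.
naveed_rows = [row for row in rows if row[0] == 1]

for row in rows:
    print(row)
```

Rizwan has no entry in Person_Address, so the inner joins leave him out of the result entirely.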