Find the owner of an EC2 instance with Athena and CloudTrail
To find the owner of each EC2 instance, I query the CloudTrail logs stored in S3 with Athena.
I have a table in Athena with the following structure:
CREATE EXTERNAL TABLE cloudtrail_logs (
    eventversion STRING,
    useridentity STRUCT<
        type:STRING,
        principalid:STRING,
        arn:STRING,
        accountid:STRING,
        invokedby:STRING,
        accesskeyid:STRING,
        userName:STRING,
        sessioncontext:STRUCT<
            attributes:STRUCT<
                mfaauthenticated:STRING,
                creationdate:STRING>,
            sessionissuer:STRUCT<
                type:STRING,
                principalId:STRING,
                arn:STRING,
                accountId:STRING,
                userName:STRING>>>,
    eventtime STRING,
    eventsource STRING,
    eventname STRING,
    awsregion STRING,
    sourceipaddress STRING,
    useragent STRING,
    errorcode STRING,
    errormessage STRING,
    requestparameters STRING,
    responseelements STRING,
    additionaleventdata STRING,
    requestid STRING,
    eventid STRING,
    resources ARRAY<STRUCT<
        ARN:STRING,
        accountId:STRING,
        type:STRING>>,
    eventtype STRING,
    apiversion STRING,
    readonly STRING,
    recipientaccountid STRING,
    serviceeventdetails STRING,
    sharedeventid STRING,
    vpcendpointid STRING
)
PARTITIONED BY (account string, region string, year string)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://<BUCKET>/AWSLogs/';
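Since this table is partitioned by account, region and year, Athena only reads partitions that have been registered; with none registered, every query returns zero rows. A minimal sketch of registering one partition manually (the account ID, region, year and the `<BUCKET>` placeholder are examples to replace with your own values):

```sql
-- Register a single partition; repeat (or script) for each account/region/year.
ALTER TABLE cloudtrail_logs ADD IF NOT EXISTS
PARTITION (account = '111122223333', region = 'eu-west-1', year = '2017')
LOCATION 's3://<BUCKET>/AWSLogs/111122223333/CloudTrail/eu-west-1/2017/';
```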
I want to find the identity of the user who launched an EC2 instance, so I need to parse the responseelements field and keep only the rows whose responseelements contains a particular instance ID.
The responseelements field looks like this:
{
"requestId":"cab34472-31cc-44cd-ae32-a84077e55cb6",
"reservationId":"r-05964c8549788ac50",
"ownerId":"xxxxxxxxxx",
"groupSet":{},
"instancesSet":{
"items":[
{"instanceId":"i-043543cb4c12",
"imageId":"ami-078df974",
"instanceState":{"code":0,"name":"pending"},
"privateDnsName":"ip-444444.eu-west-1.compute.internal",
"keyName":"key-dev","amiLaunchIndex":0,"productCodes":{},
"instanceType":"t2.large",
"launchTime":1488438050000,
"placement":{"availabilityZone":"eu-west-1b","tenancy":"default"},
"monitoring":{"state":"pending"},
"subnetId":"subnet-d8fffff",
"vpcId":"vpc-444435",
"privateIpAddress":"10.0.42.49",
"stateReason":{"code":"pending","message":"pending"},
"architecture":"x86_64",
"rootDeviceType":"ebs",
"rootDeviceName":"/dev/xvda",
"blockDeviceMapping":{},
"virtualizationType":"hvm",
"hypervisor":"xen",
"clientToken":"c6e53004-c561-437d-a642-196489ff297c_subnet-fffffffff",
"groupSet":{"items":[{"groupId":"sg-64878700","groupName":"MetamSecurityGroup"}]},
"sourceDestCheck":true,
"networkInterfaceSet":{
"items":[
{"networkInterfaceId":"eni-b16b66f0",
"subnetId":"subnet-dffffff",
"vpcId":"vpc-50fffff35",
"ownerId":"xxxxxxxx",
"status":"in-use",
"macAddress":"fdsfdsfsdfqdsf",
"privateIpAddress":"10.0.42.34234213",
"privateDnsName":"ip-1dddddd.eu-west-1.compute.internal",
"sourceDestCheck":true,
"groupSet":{"items":[{"groupId":"sg-64878700","groupName":"MetamSecurityGroup"}]},
"attachment":{"attachmentId":"eni-attach-45619121","deviceIndex":0,"status":"attaching","attachTime":1488438050000,"deleteOnTermination":true},
"privateIpAddressesSet":{"item":[{"privateIpAddress":"10ffffff","privateDnsName":"ip-ffffff.eu-west-1.compute.internal","primary":true}]},
"ipv6AddressesSet":{},
"tagSet":{}}]}
,"iamInstanceProfile":{"arn":"arn:aws:iam::xxxxx:instance-profile/infra-EC2InstanceProfile-1D59C5YR0LIYJ","id":"eeeeeeeeeeeeeeeeee"},
"ebsOptimized":false}
]
},
"requesterId":"226008221399"
}
This is the query that I tried:
SELECT DISTINCT eventsource, eventname, useridentity.userName, eventtime,
       json_extract(responseelements, '$.instanceId') as instance_id
FROM cloudtrail_logs
WHERE account = 'xxxxxxxxxxxxxxx'
AND eventname = 'RunInstances';
but it gives instance_id as an empty column.
How do I properly extract only the instance_id from responseelements?
I found the right query to find the owner of an EC2 instance; it might help someone! The instanceId is nested inside the instancesSet.items array, so the JSON path must include that prefix:
SELECT DISTINCT eventsource, eventname, useridentity.userName, eventtime,
       json_extract(responseelements, '$.instancesSet.items[0].instanceId') as instance_id
FROM cloudtrail_logs
WHERE account = 'xxxxxxx'
AND eventname = 'RunInstances'
AND responseelements LIKE '%i-3434ecb4c12%';
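Note that a single RunInstances call can launch several instances, and items[0] only returns the first. If that matters, here is a hedged sketch that unnests every element of instancesSet.items instead (Presto/Athena syntax, using the table and columns defined above; untested against your data):

```sql
-- Expand the items array so each launched instance becomes its own row.
SELECT DISTINCT useridentity.userName, eventtime,
       json_extract_scalar(item, '$.instanceId') AS instance_id
FROM cloudtrail_logs
CROSS JOIN UNNEST(
    CAST(json_extract(responseelements, '$.instancesSet.items') AS ARRAY(JSON))
) AS t (item)
WHERE eventname = 'RunInstances';
```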
Great answer! I was searching forever, thank you! One small change: in my case the account partition column was not needed; it threw this error:
SYNTAX_ERROR: line 3:7: Column 'account' cannot be resolved
Here is how I ran it:
SELECT DISTINCT eventsource,
eventname,
useridentity.userName,
eventtime,
json_extract(responseelements, '$.instancesSet.items[0].instanceId') as instance_id
FROM <myCloudLogTable>
WHERE eventname = 'RunInstances'
AND responseelements LIKE '%<myinstanceId>%';
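As a variant, the LIKE pattern match on the raw JSON string can be replaced by comparing the extracted value directly; a sketch, assuming the same table and a placeholder instance ID:

```sql
SELECT DISTINCT eventsource, eventname, useridentity.userName, eventtime
FROM cloudtrail_logs
WHERE eventname = 'RunInstances'
  AND json_extract_scalar(responseelements, '$.instancesSet.items[0].instanceId') = '<myinstanceId>';
```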
Related
error in spark_sql with "value as is not a member of object"
I have this case class:
case class Hotel(id: String, name: String, cc1: String, city_preferred: String)
The CSV is like:
id       name     cc1  city_hotel
1949417  apart A  pl   Sopot
2023862  apart B  es   Granada
1967734  apart C  hr   Ici
I have to show the id, name and country, but only for the hotels in Spain. I tried:
val hotelsSpain = Hotel.as("Hoteles")
  .select("cc1", "id", "name", "city_hotel")
  .where("cc1" == "es")
display(hotelsSpain)
but the code returns this error:
value as is not a member of object Hotel
val hotelsSpain = Hotel.as("Hoteles")
I think it is a little error but I don't see it. Thank you.
In your schema the column is defined as "city_prefered" (sic), but your code references "city_hotel". It looks to me like this error is simply telling you that "city_hotel" doesn't exist on this object.
How can I insert data into this structure in BigQuery?
In BigQuery, I want to insert some data into this very simple data structure:
Field          Type    Mode
id             STRING  NULLABLE
policies       RECORD  REPEATED
  s            RECORD  NULLABLE
    something  STRING  NULLABLE
    riskTypes  RECORD  REPEATED
      code     STRING  NULLABLE
In the light of my previous question, I would expect the syntax to be as follows:
UPDATE `tablename`
SET policies = ARRAY_CONCAT(
    policies,
    [
        struct<s struct<something STRING, riskTypes ARRAY<struct<code STRING>>>
            ("example something", [("example description")])
    ]
)
WHERE id = 'Moose';
But this gives an error: Unexpected "[" (before the "example description")
Below should work:
update `tablename`
set policies = policies || [
    struct<s struct<something string, riskTypes array<struct<code string>> >>
        (struct('example something' as something,
                [struct('example description 1' as code), struct('example description 2')]))
]
where id = 'Moose'
my MyBatis query does not work for Oracle, but works when tested in SQL Developer
I'm new to MyBatis. I created a mapper to get the timestamp I want, but it works for only one period of time; later it just returns null. For another function, I use SimpleDateFormat to format the date as "dd-MMM-yyyy hh.mm.ss.SSS a" so I can insert it into a TIMESTAMP(3) column of the Oracle database. Below is my code, any suggestions?
@Select("SELECT MAX(TIME_STARTED) FROM METADATA_SYNC_LOG WHERE STATUS = 'SUCCESSFUL'")
Date getLastSyncStartedTime();

@Insert("INSERT INTO METADATA_SYNC_LOG (ID, SERVICE, HOST, TIME_STARTED, TIME_FINISHED, STATUS, SYNCRESULT) values (#{ID}, #{SERVICE}, #{HOST}, #{TIME_STARTED}, #{TIME_FINISHED}, #{STATUS}, #{SYNCRESULT})")
void insertIntoMetadata_Sync_log(@Param("ID") String id, @Param("SERVICE") String service, @Param("HOST") String host, @Param("TIME_STARTED") String time_started, @Param("TIME_FINISHED") String time_finished, @Param("STATUS") String status, @Param("SYNCRESULT") String syncresult);

@Update("MERGE INTO SERVICE_METADATA d USING (SELECT #{SERVICE} SERVICE, #{ID} ID, #{SERVICE_ENTITY} SERVICE_ENTITY, #{TIME_CREATED} TIME_CREATED, #{TIME_FINISHED} TIME_FINISHED, #{DETAILS} DETAILS FROM dual) s ON (d.ID = s.ID) WHEN MATCHED THEN UPDATE SET d.SERVICE = s.SERVICE, d.SERVICE_ENTITY = s.SERVICE_ENTITY, d.TIME_CREATED = s.TIME_CREATED, d.TIME_FINISHED = s.TIME_FINISHED, d.DETAILS = s.DETAILS WHEN NOT MATCHED THEN INSERT (SERVICE, ID, SERVICE_ENTITY, TIME_CREATED, TIME_FINISHED, DETAILS) VALUES (s.SERVICE, s.ID, s.SERVICE_ENTITY, s.TIME_CREATED, s.TIME_FINISHED, s.DETAILS)")
void insertOrUpdateIntoService_Metadata(@Param("SERVICE") String service, @Param("ID") String id, @Param("SERVICE_ENTITY") String service_entity, @Param("TIME_CREATED") String time_created, @Param("TIME_FINISHED") String time_finished, @Param("DETAILS") String details);
micronaut-data and composite key mapping
I have an entity with a composite key:
@Entity
data class Page(
    @EmbeddedId val pageId: PageId,
    ...
)

@Embeddable
data class PageId(
    @Column(name = "id") val id: UUID,
    @Column(name = "is_published") val isPublished: Boolean
)
But I need to respect the existing column names in the db table, which are 'id' and 'is_published'. Querying the db with a JDBCRepository I get the error:
SQL Error executing Query: ERROR: column page_.page_id_published does not exist
Is there any way that I can map the columns correctly?
Trial and error led me to the answer: somehow Micronaut does not like a Boolean to be named 'isPublished'. When I rename it to 'published' it works fine:
data class PageId(
    @MappedProperty(value = "id") val id: UUID,
    @MappedProperty(value = "is_published") val published: Boolean
)
Running Plain SQL dynamically in Quill using infix fails with wrong query syntax during runtime
I want to construct my query in plain SQL and then run it using Quill; I am using the infix operator. My code is like this:
case class Employee(name: String, age: String, company_name: String)
case class Company(name: String, pin_code: String)
case class CombinedEmployee(employee_age: Int, employee_name: Option[String], company_name: String, pin: Option[String])

// it can be anything which is calculated dynamically
val sql = "SELECT t1.age AS employee_age, t1.name AS employee_name, t2.name AS company_name, t2.pin as pin FROM employee t1 JOIN company t2 ON t1.company_name = t2.name"

def rawQuery = quote { (myQuery: String) =>
  infix"""$myQuery""".as[Query[CombinedEmployee]]
}
I am calling ctx.translate(rawQuery(lift(sql))), and this prints:
SELECT x.employee_age, x.employee_name, x.company_name, x.pin FROM ('SELECT t1.age AS employee_age, t1.name AS employee_name, t2.name AS company_name, t2.pin as pin FROM employee t1 JOIN company t2 ON t1.company_name = t2.name') AS x
which is wrong syntax, and when I executed it using the run method it failed. Is this the correct way to write plain SQL? If not, is there a different way to run plain SQL and map it into case classes?
This can be solved by adding a # before the $:
def rawQuery = quote { (myQuery: String) =>
  infix"""#$myQuery""".as[Query[CombinedEmployee]]
}