
Automate Amazon Athena queries for PCI DSS log review using AWS Lambda

In this post, I will show you how to use AWS Lambda to automate PCI DSS (v3.2.1) evidence generation and daily log review to assist with your continuous PCI DSS activities. We will specifically look at AWS CloudTrail logs stored centrally in Amazon Simple Storage Service (Amazon S3) (which is also a Well-Architected Security Pillar best practice) and use Amazon Athena to query them.

This post assumes familiarity with creating a database in Athena. If you're new to Athena, please take a look at the Athena getting started guide and create a database before continuing. Note the bucket chosen for the output of Athena query results; we will use it later in this post.
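If you prefer to script that prerequisite, here is a minimal sketch that creates a database through the Athena API with boto3. The database name cloudtrail_pci and the <query_results_bucket> placeholder are assumptions for illustration; substitute your own values.

import boto3

athena = boto3.client('athena')

# Hypothetical names -- replace with your own database and query results bucket.
DATABASE_NAME = 'cloudtrail_pci'
RESULTS_BUCKET = 's3://<query_results_bucket>/'

# DDL statements run through the same StartQueryExecution API as regular queries.
athena.start_query_execution(
    QueryString=f'CREATE DATABASE IF NOT EXISTS {DATABASE_NAME}',
    ResultConfiguration={'OutputLocation': RESULTS_BUCKET})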

In this post, we walk through:

  • Creating a partitioned table for your AWS CloudTrail logs. To reduce the time and cost of querying results in Athena, we'll demonstrate how to partition your data. If you're not already familiar with partitioning, you can learn about it in the Athena user guide.
  • Constructing SQL queries to search for PCI DSS audit log evidence. The SQL queries provided in this post relate directly to PCI DSS requirement 10. Customizing these queries to meet your responsibilities may help you prepare for a PCI DSS assessment.
  • Creating an AWS Lambda function to automate running these SQL queries daily, in order to help address the PCI DSS daily log review requirement 10.6.1.

Create and partition a table

The following code will create and partition a table for your CloudTrail logs. Before you execute this query, be sure to replace the variable placeholders with the information from your database. They are:

  • <table_name> – the name of your Athena table
  • LOCATION – the path to your CloudTrail logs in Amazon S3. An example is included in the code that follows. It includes the variable placeholders:
    • <account_id> – your AWS account number. If using organizational CloudTrail, use the following format throughout the post for this variable: o-<organization_id>/<account_id>
    • <log_bucket_name> – the bucket name where the CloudTrail logs to be queried reside

CREATE EXTERNAL TABLE <table_name> (
    eventVersion STRING,
    userIdentity STRUCT<
        type: STRING,
        principalId: STRING,
        arn: STRING,
        accountId: STRING,
        invokedBy: STRING,
        accessKeyId: STRING,
        userName: STRING,
        sessionContext: STRUCT<
            attributes: STRUCT< mfaAuthenticated: STRING, creationDate: STRING>,
            sessionIssuer: STRUCT< type: STRING, principalId: STRING, arn: STRING, accountId: STRING, userName: STRING>>>,
    eventTime STRING,
    eventSource STRING,
    eventName STRING,
    awsRegion STRING,
    sourceIpAddress STRING,
    userAgent STRING,
    errorCode STRING,
    errorMessage STRING,
    requestParameters STRING,
    responseElements STRING,
    additionalEventData STRING,
    requestId STRING,
    eventId STRING,
    resources ARRAY<STRUCT< arn: STRING, accountId: STRING, type: STRING>>,
    eventType STRING,
    apiVersion STRING,
    readOnly STRING,
    recipientAccountId STRING,
    serviceEventDetails STRING,
    sharedEventID STRING,
    vpcEndpointId STRING
)
COMMENT 'CloudTrail table'
PARTITIONED BY (region string, year string, month string, day string)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://<log_bucket_name>/AWSLogs/<account_id>/CloudTrail/'
TBLPROPERTIES ('classification'='cloudtrail');

Execute the query. You should see a message stating Query successful.

Figure 1: Query successful

The preceding query creates the CloudTrail table and defines the partitions in your Athena database. Before you begin running queries to generate evidence, you will need to run ALTER TABLE commands to finalize the partitioning.

Be sure to update the following variable placeholder with your information:

  • <database> – the name of your Athena database

Provide values for the following variables:

  • region – region of the logs to partition
  • month – month of the logs to partition
  • day – day of the logs to partition
  • year – year of the logs to partition
  • LOCATION – the path to the CloudTrail logs in Amazon S3 to partition, down to the specific day (should match the preceding values of region, month, day, and year). It includes the variable placeholders <log_bucket_name> and <account_id> described earlier:

ALTER TABLE <database>.<table_name> ADD PARTITION (region='us-east-1', month='02', day='28', year='2020') LOCATION 's3://<log_bucket_name>/AWSLogs/<account_id>/CloudTrail/us-east-1/2020/02/28/';

After the partition has been configured, you can query logs from the region and date that was partitioned. Here's an example for PCI DSS requirement 10.2.4 (all relevant PCI DSS requirements are described later in this post).


SELECT * FROM <database>.<table_name> WHERE eventname = 'ConsoleLogin' AND responseelements LIKE '%Failure%' AND region='us-east-1' AND year='2020' AND month='02' AND day='28';
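If you want to run this check from a script rather than the Athena console, the following is a minimal sketch using boto3 that starts the query and polls for completion. The <database>, <table_name>, and <query_results_bucket> placeholders are the same ones used throughout this post; the polling interval is an arbitrary choice.

import time
import boto3

athena = boto3.client('athena')

# The 10.2.4 example query; replace the placeholders with your values.
QUERY = ("SELECT * FROM <database>.<table_name> "
         "WHERE eventname = 'ConsoleLogin' AND responseelements LIKE '%Failure%' "
         "AND region='us-east-1' AND year='2020' AND month='02' AND day='28';")

response = athena.start_query_execution(
    QueryString=QUERY,
    ResultConfiguration={'OutputLocation': 's3://<query_results_bucket>/'})
execution_id = response['QueryExecutionId']

# Poll until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=execution_id)
    state = status['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        print(state)
        break
    time.sleep(2)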

Create a Lambda function to save time

As you can see, the process above can involve a lot of manual steps as you set up partitioning for each region and construct a query for each day or region. Let's simplify the process by putting these into a Lambda function.

Use the Lambda console to create a function

To create the Lambda function:

  1. Open the Lambda console, choose Create function, and select the option to Author from scratch.
  2. Enter Athena_log_query as the function name, and select Python 3.8 as the runtime.
  3. Under Choose or create an execution role, select Create a new role with basic Lambda permissions.
  4. Choose Create function.
  5. Once the function is created, choose the Permissions tab at the top of the page and choose the Execution role to view it in the IAM console. It will look similar to the following figure.
    Figure 2: Permissions tab

Update the IAM role to allow Lambda permissions to the appropriate services

  1. In the IAM console, choose the policy name. Choose Edit policy, then choose the JSON tab and paste the following code into the window, replacing the following variables and placeholders:
    • us-east-1 – This is the region where resources are located. Change only if necessary.
    • <account_id> – your AWS account number
    • <database> and <table_name> – your Athena database and table names
    • <query_results_bucket> – bucket name you chose to store the query results when setting up Athena
    • <log_bucket_name> – the bucket name where the CloudTrail logs to be queried reside
    
    
        "Version": "2012-10-17",
        "Statement": [
            
                "Effect": "Allow",
                "Action": [
                    "glue:UpdateDatabase",
                    "glue:BatchCreatePartition",
                    "glue:GetDatabase",
                    "athena:StartQueryExecution",
                    "glue:GetPartitions",
                    "glue:UpdateTable",
                    "s3:CreateBucket",
                    "s3:ListBucket",
                    "glue:GetTable",
                    "s3:ListMultipartUploadParts",
                    "s3:PutObject",
                    "s3:GetObjectAcl",
                    "s3:GetObject",
                    "athena:CancelQueryExecution",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "s3:GetBucketLocation",
                    "glue:UpdatePartition"
                ],
                "Resource": [
                    "arn:aws:glue:us-east-1::catalog",
                    "arn:aws:glue:us-east-1::desk//",
                    "arn:aws:glue:us-east-1::data source/mydatabase",
                    "arn:aws:s3:::/*",
                    "arn:aws:s3:::",
                                    "arn:aws:s3:::/*",
                                    "arn:aws:s3:::",
                    "arn:aws:athena:us-east-1::workgroup/major"
                ]
            ,
            
                "Effect": "Allow",
                "Action": [
                    "logs:PutLogEvents",
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream"
                ],
                "Reference": "arn:aws:logs:us-east-1::*"
            
        ]
    
    

    Note: Depending on the environment, this policy may not be restrictive enough and should be limited to only users needing access to the cardholder data environment and audit logs. More information about restricting IAM policies can be found in IAM JSON Policy Elements: Condition Operators; a small example follows these steps.

  2. Choose Review policy, then Save changes.
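As one hedged illustration of such a restriction (the CIDR range below is a made-up value), a statement can carry a Condition element so it only applies to requests originating from your corporate network:

{
    "Effect": "Allow",
    "Action": "athena:StartQueryExecution",
    "Resource": "arn:aws:athena:us-east-1:<account_id>:workgroup/primary",
    "Condition": {
        "IpAddress": {
            "aws:SourceIp": "203.0.113.0/24"
        }
    }
}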

Customize the Lambda function

  1. On the Lambda dashboard, choose the Configuration tab. Under Basic settings, increase the function timeout to 5 minutes to ensure that the function always has time to finish running your queries, then select Save. Best Practices for Developing on AWS Lambda has more tips for using Lambda.
  2. Paste the following code into the function editor on the Configuration tab, replacing the existing text. The code includes eight example queries to run and can be customized as needed.
    The first query will add partitions for your Amazon S3 logs so that the following seven queries will run quickly and cost-effectively. This code combines the partitioning and example Athena queries to assist in meeting PCI DSS logging requirements, which are explained in more detail below the code.
    Replace these values in the code that follows:

    • REGION1 – first region to partition
    • REGION2 – second region to partition*
    
    import boto3
    import datetime
    import time

    #EDIT THE FOLLOWING#
    #----------------------#

    #This should be the name of your Athena database
    ATHENA_DATABASE = "<database>"

    #This should be the name of your Athena database table
    ATHENA_TABLE = "<table_name>"

    #This is the Amazon S3 bucket name you want partitioned and logs queried from:
    LOG_BUCKET = "<log_bucket_name>"

    #AWS account number for the Amazon S3 path to your CloudTrail logs
    AWS_ACCOUNT_ID = "<account_id>"

    #This is the Amazon S3 bucket name for the Athena query results:
    OUTPUT_LOG_BUCKET = "<query_results_bucket>"

    #Define regions to partition
    REGION1 = "us-east-1"
    REGION2 = "us-west-2"
    #----------------------#
    #STOP EDITING#

    RETRY_COUNT = 50

    #Get the current date and split it into variables to use in the queries below
    CURRENT_DATE = datetime.datetime.today()
    DATEFORMATTED = (CURRENT_DATE.isoformat())
    ATHENA_YEAR = str(DATEFORMATTED[:4])
    ATHENA_MONTH = str(DATEFORMATTED[5:7])
    ATHENA_DAY = str(DATEFORMATTED[8:10])

    #Location for the Athena query results
    OUTPUT_LOCATION = "s3://"+OUTPUT_LOG_BUCKET+"/DailyAthenaLogs/CloudTrail/"+str(CURRENT_DATE.isoformat())

    #Athena query definitions for PCI DSS requirements
    YEAR_MONTH_DAY = f'year=\'{ATHENA_YEAR}\' AND month=\'{ATHENA_MONTH}\' AND day=\'{ATHENA_DAY}\';'
    ATHENA_DB_TABLE = f'{ATHENA_DATABASE}.{ATHENA_TABLE}'
    PARTITION_STATEMENT_1 = f'partition (region="{REGION1}", month="{ATHENA_MONTH}", day="{ATHENA_DAY}", year="{ATHENA_YEAR}")'
    LOCATION_1 = f' location "s3://{LOG_BUCKET}/AWSLogs/{AWS_ACCOUNT_ID}/CloudTrail/{REGION1}/{ATHENA_YEAR}/{ATHENA_MONTH}/{ATHENA_DAY}/"'
    PARTITION_STATEMENT_2 = f'partition (region="{REGION2}", month="{ATHENA_MONTH}", day="{ATHENA_DAY}", year="{ATHENA_YEAR}")'
    LOCATION_2 = f' location "s3://{LOG_BUCKET}/AWSLogs/{AWS_ACCOUNT_ID}/CloudTrail/{REGION2}/{ATHENA_YEAR}/{ATHENA_MONTH}/{ATHENA_DAY}/"'
    SELECT_STATEMENT = "SELECT * FROM "+ATHENA_DB_TABLE+" WHERE "
    LIKE_BUCKET = f' \'%{LOG_BUCKET}%\''


    #Query to partition selected regions
    QUERY_1 = f'ALTER TABLE {ATHENA_DB_TABLE} ADD IF NOT EXISTS {PARTITION_STATEMENT_1} {LOCATION_1} {PARTITION_STATEMENT_2} {LOCATION_2}'

    #Access to audit trails or CHD 10.2.1/10.2.3
    QUERY_2 = f'{SELECT_STATEMENT} requestparameters LIKE {LIKE_BUCKET} AND sourceipaddress <> \'cloudtrail.amazonaws.com\' AND sourceipaddress <> \'athena.amazonaws.com\' AND eventName = \'GetObject\' AND {YEAR_MONTH_DAY}'

    #Root actions PCI DSS 10.2.2
    QUERY_3 = f'{SELECT_STATEMENT} userIdentity.sessionContext.sessionIssuer.userName LIKE \'%root%\' AND {YEAR_MONTH_DAY}'

    #Failed logons PCI DSS 10.2.4
    QUERY_4 = f'{SELECT_STATEMENT} eventname = \'ConsoleLogin\' AND responseelements LIKE \'%Failure%\' AND {YEAR_MONTH_DAY}'

    #Privilege changes PCI DSS 10.2.5.b, 10.2.5.c
    QUERY_5 = f'{SELECT_STATEMENT} eventname LIKE \'%AddUserToGroup%\' AND requestparameters LIKE \'%Admin%\' AND {YEAR_MONTH_DAY}'

    #Initialization, stopping, or pausing of the audit logs PCI DSS 10.2.6
    QUERY_6 = f'{SELECT_STATEMENT} eventname = \'StopLogging\' OR eventname = \'StartLogging\' AND {YEAR_MONTH_DAY}'

    #Suspicious activity PCI DSS 10.6
    QUERY_7 = f'{SELECT_STATEMENT} eventname LIKE \'%DeleteSecurityGroup%\' OR eventname LIKE \'%CreateSecurityGroup%\' OR eventname LIKE \'%UpdateSecurityGroup%\' OR eventname LIKE \'%AuthorizeSecurityGroup%\' AND {YEAR_MONTH_DAY}'

    QUERY_8 = f'{SELECT_STATEMENT} eventname LIKE \'%Subnet%\' and eventname NOT LIKE \'Describe%\' AND {YEAR_MONTH_DAY}'

    #Define a function to report query status for each query
    def query_stat_fun(query, response):
        client = boto3.client('athena')
        query_execution_id = response['QueryExecutionId']
        print(query_execution_id + ' : ' + query)
        for i in range(1, 1 + RETRY_COUNT):
            query_status = client.get_query_execution(QueryExecutionId=query_execution_id)
            query_fail_status = query_status['QueryExecution']['Status']
            query_execution_status = query_fail_status['State']

            if query_execution_status == 'SUCCEEDED':
                print("STATUS: " + query_execution_status)
                break

            if query_execution_status == 'FAILED':
                print(query_fail_status)
                #A failed query is terminal; move on to the next one
                break

            else:
                print("STATUS: " + query_execution_status)
                time.sleep(i)
        else:
            client.stop_query_execution(QueryExecutionId=query_execution_id)
            raise Exception('Maximum Retries Exceeded')

    def lambda_handler(event, context):
        client = boto3.client('athena')
        queries = [QUERY_1, QUERY_2, QUERY_3, QUERY_4, QUERY_5, QUERY_6, QUERY_7, QUERY_8]
        for query in queries:
            response = client.start_query_execution(
                QueryString=query,
                QueryExecutionContext={
                    'Database': ATHENA_DATABASE },
                ResultConfiguration={
                    'OutputLocation': OUTPUT_LOCATION })
            query_stat_fun(query, response)
    

    Note: More regions can be added if you have additional regions to partition. The ADD partition statement can be copied and pasted to add additional regions as needed. Additionally, you can hard code the regions for your environment into the statements (see the sketch after these steps).

  3. Choose Save in the top right.
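As an illustration of the preceding note, the following is a hedged sketch of how the hard-coded region pair could be generalized to a list. It assumes the same LOG_BUCKET, AWS_ACCOUNT_ID, ATHENA_DB_TABLE, and ATHENA_* date variables defined in the function above; only the REGIONS list is new.

# Sketch: build QUERY_1 for any number of regions.
REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]

partition_clauses = []
for region in REGIONS:
    partition_clauses.append(
        f'partition (region="{region}", month="{ATHENA_MONTH}", '
        f'day="{ATHENA_DAY}", year="{ATHENA_YEAR}") '
        f'location "s3://{LOG_BUCKET}/AWSLogs/{AWS_ACCOUNT_ID}/CloudTrail/'
        f'{region}/{ATHENA_YEAR}/{ATHENA_MONTH}/{ATHENA_DAY}/"')

QUERY_1 = f'ALTER TABLE {ATHENA_DB_TABLE} ADD IF NOT EXISTS ' + ' '.join(partition_clauses)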

Athena queries used to gather evidence

The queries used to gather evidence for PCI DSS are broken out from the Lambda function we created, using the partitioned date example from above. They are listed with their respective requirement.

Note: AWS owns the security OF the cloud, providing high levels of security in alignment with our numerous compliance programs. The customer is responsible for the security of their assets IN the cloud, keeping its content secure and compliant. The queries below are meant to be a proof of concept and should be tailored to your environment.

10.2.1/10.2.3 – Implement automated audit trails for all system components to reconstruct access to either or both cardholder data and audit trails:


"SELECT * FROM . WHERE requestparameters Want '%%' AND sourceipaddress <> 'cloudtrail.amazonaws.com' AND sourceipaddress <>  'athena.amazonaws.com' AND eventName = 'GetObject' AND year='2020' AND 30 days='02' AND day='28';"

10.2.2 – Implement automated audit trails for all system components to reconstruct all actions taken by any individual with root or administrative privileges.


"SELECT * FROM . WHERE userIdentity.sessionContext.sessionIssuer.userName Want '%root%' AND year='2020' AND 30 days='02' AND day='28';"

10.2.4 – Implement automated audit trails for all system components to reconstruct invalid logical access attempts.


"SELECT * FROM . WHERE eventname = 'ConsoleLogin' AND responseelements LIKE '%Failing%' AND yr='2020' AND month='02' AND day='28';"

10.2.5.b – Verify all elevation of privileges is logged.

10.2.5.c – Verify all changes, additions, or deletions to any account with root or administrative privileges are logged:


"SELECT * FROM . WHERE eventname Want '%AddUserToGroup%' AND requestparameters Want '%Admin%' AND year='2020' AND 30 days='02' AND day='28';"

10.2.6 – Implement automated audit trails for all system components to reconstruct the initialization, stopping, or pausing of the audit logs:


"SELECT * FROM . WHERE eventname = 'StopLogging' OR eventname = 'StartLogging' AND year='2020' AND 30 days='02' AND day='28';"

10.6 – Review logs and security events for all system components to identify anomalies or suspicious activity:


"SELECT * FROM . WHERE eventname Want '%DeleteSecurityGroup%' OR eventname Want '%CreateSecurityGroup%' OR eventname Want '%UpdateSecurityGroup%' OR eventname Want '%AuthorizeSecurityGroup%' AND year='2020' AND 30 days='02' AND day='28';" 

"SELECT * FROM . WHERE eventname Want '%Subnet%' and eventname NOTHING LIKE 'Describe%' AND year='2020' AND 30 days='02' AND day='28';" 

You can use the AWS Command Line Interface (AWS CLI) to invoke the Lambda function using the following command, replacing <function_name> with the name of the Lambda function you created:


aws lambda invoke --function-name <function_name> outfile

The AWS Lambda API Reference has more information on using Lambda with the AWS CLI.

Note: The results from the function will be located at the OUTPUT_LOCATION defined within the Lambda function.
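Because the function writes each day's output under a date-stamped prefix, you can list that day's evidence files with a short sketch like the following; the bucket placeholder mirrors the OUTPUT_LOCATION variable and should be replaced with your query results bucket.

import datetime
import boto3

s3 = boto3.client('s3')

# Mirrors the OUTPUT_LOCATION prefix built in the Lambda function.
bucket = '<query_results_bucket>'
prefix = 'DailyAthenaLogs/CloudTrail/' + datetime.datetime.today().isoformat()[:10]

response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get('Contents', []):
    print(obj['Key'])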

Use Amazon CloudWatch to run your Lambda function

You can create a rule in CloudWatch to have this function run automatically on a set schedule.

Create the CloudWatch rule

    1. From the CloudWatch dashboard, under Events, select Rules, then Create rule.
    2. Under Event Source, select the radio button for Schedule and choose a fixed rate or enter a custom cron expression.
    3. Finally, in the Targets section, choose Lambda function and find your Lambda function in the drop-down list.
      The example screenshot shows a CloudWatch rule configured to invoke the Lambda function every day:

      Figure 3: CloudWatch rule

  1. Once the schedule is configured, choose Configure details to move to the next screen.
  2. Enter a name for the rule, make sure that State is enabled, and choose Create rule.
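If you prefer to create the schedule from code instead of the console, the following is a minimal boto3 sketch. The rule name, schedule, and function ARN are assumptions for illustration; the permission statement lets CloudWatch Events invoke the function.

import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')

# Hypothetical names -- replace with your own.
RULE_NAME = 'DailyAthenaLogQuery'
FUNCTION_NAME = 'Athena_log_query'
FUNCTION_ARN = 'arn:aws:lambda:us-east-1:<account_id>:function:Athena_log_query'

# Create a rule that fires once a day.
rule = events.put_rule(Name=RULE_NAME, ScheduleExpression='rate(1 day)', State='ENABLED')

# Point the rule at the Lambda function.
events.put_targets(Rule=RULE_NAME, Targets=[{'Id': '1', 'Arn': FUNCTION_ARN}])

# Allow CloudWatch Events to invoke the function.
lambda_client.add_permission(
    FunctionName=FUNCTION_NAME,
    StatementId='cloudwatch-daily-invoke',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn'])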

Check that your function is running

You can then navigate to your Lambda function's CloudWatch log group to see if the function is running as intended.

To locate the correct CloudWatch log group, from your Lambda function in the console, choose the Monitoring tab, then select View logs in CloudWatch.

Figure 4: View logs in CloudWatch

You can take this a step further and create an SNS notification to email you when the function is triggered.
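A hedged sketch of that idea: create an SNS topic with an email subscription, then call a helper like this from the end of lambda_handler. The topic ARN below is hypothetical.

import boto3

def notify_completion(queries_run):
    # Publish a run summary to an SNS topic that has an email subscription.
    sns = boto3.client('sns')
    sns.publish(
        # Hypothetical topic ARN -- create the topic and subscription first.
        TopicArn='arn:aws:sns:us-east-1:<account_id>:athena-log-query',
        Subject='Daily PCI DSS log queries completed',
        Message=f"{queries_run} Athena queries were started for today's log review.")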

Summary

In this post, we walked through partitioning an Athena table, which helps reduce the time and cost of running queries on your S3 buckets. We then constructed example SQL queries related to PCI DSS requirement 10 to assist in audit preparation. Finally, we created a Lambda function to automate running daily queries to pull PCI DSS audit log evidence from Amazon S3, to help with the PCI DSS daily log review requirement. I encourage you to customize, add, or remove the SQL queries to best fit your needs and compliance requirements.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Logan Culotta

Logan Culotta is a Security Assurance Consultant and a current Qualified Security Assessor (QSA). Logan is part of the AWS Security Assurance team, which is also a Qualified Security Assessor Company (QSAC). He enjoys finding ways to automate security and compliance in the AWS cloud. In his free time, you can find him spending time with his family, road cycling, or cooking.