Monday, May 7, 2018

[AWS] Reading Aurora audit logs in Cloudwatch for DDL changes

Another form of log-watching, this time the Audit logs that Aurora can push to Cloudwatch.  It's not all stuff by any means; heck, the example they use is just about failed logins.

In this case, we're reading the Cloudwatch logs for our RDS clusters.
Since Cloudwatch offers filters, in this example we're looking for QUERY commands, since we've turned Server Audit on, and are uploading connects + queries to Cloudwatch.  (More information on that in: https://aws.amazon.com/blogs/database/monitoring-amazon-aurora-audit-events-with-amazon-cloudwatch/).  Remember to set a cap on it, though then interesting things could be buried.

Why are we doing this?  In our case, this gives us a few bits of cool info.  Specifically, we can log all the DDL commands that come across our cluster, making sure everything is behaving as expected.


import-module awspowershell
# removing variables. Note that this will wipe out default region, credentials, etc, if before this line.
Remove-Variable -Name * -ErrorAction SilentlyContinue
Set-DefaultAWSRegion -Region us-west-1
$now = get-date
# For conversion from Unix Epoch
$origin = New-Object -Type DateTime -ArgumentList 1970, 1, 1, 0, 0, 0, 0
# here, we use implicit credentials, so there's no $credentials
set-awscredentials -accesskey youraccesskey -secretkey yoursecretkey
$myemail = "dev@null.com"
# Get the current timestamp
$Current_Unix_Timestamp = [Math]::Floor([decimal](Get-Date(Get-Date).ToUniversalTime()-uformat "%s")) * 1000
# Get the timestamp to start at.
$hours_to_look_back = 72
$Past_Unix_Timestamp = [Math]::Floor([decimal](Get-Date((Get-Date).AddHours(-1*$hours_to_look_back)).ToUniversalTime()-uformat "%s")) * 1000
# Get a list of all of our CloudWatch log groups
$All_CW_RDS_Logs = get-CWLLogGroup -LogGroupNamePrefix "/aws/rds" | where {$_.LogGroupName -like "*/myprefix-*"}
foreach ($Specific_CW_Log_Group in $All_CW_RDS_Logs) {
write-host $Specific_CW_Log_Group.LogGroupName
$CW_NextToken = $null # reset for each log group. Required for NextToken to work
#Using $null for NextToken means we can use the same pattern as for regular logs
#NOTE: this hangs if the FilterPattern is invalid. Which apparently includes commas, backslashes, etc.
#$CW_Results =
DO {
#write-host "CWToken $CW_NextToken"
$CW_RDS +=
Get-CWLFilteredLogEvent `
-LogGroupName $Specific_CW_Log_Group.LogGroupName `
-StartTime $Past_Unix_Timestamp `
-EndTime $Current_Unix_Timestamp `
-FilterPattern "QUERY" `
-Limit 2000 `
-NextToken $CW_NextToken
#FilterPattern can't use commas. ",QUERY," should show all create/truncate/drop, but we use QUERY instead
#unlike the regular logs, this one returns a normal powershell dataset - no need to parse it out, just query events
# $CW_RDS.Events # moved to outside the loop.
$CW_NextToken = $CW_RDS.NextToken
}WHILE ($CW_NextToken -ne $null)
}
# Regex-based filtering
$Log_Filters = "temporary table","truncate "
#We use this line to take our list of log filters and generate a regex that is then compared against
# https://blogs.technet.microsoft.com/heyscriptingguy/2011/02/18/speed-up-array-comparisons-in-powershell-with-a-runtime-regex/
# the replace is needed because I want an exact match of those commas, and the escape($_) does weird stuff to it
# 2018/05/09 removed [regex] at beginning as per https://stackoverflow.com/questions/50261383/powershell-3-regex-why-is-case-insensitive-not-working#50261506
$Log_Filter_regex = (($Log_Filters|foreach {[regex]::escape($_) -replace ",","\,"}) –join "|")
# Applying the regex filter
$final_log = $CW_RDS.Events| where {$_.Message -inotmatch $Log_Filter_regex} `
| select @{Label='TimeStampHR';Expression={$origin.AddSeconds($_.TimeStamp/1000).ToLocalTime()}}, LogStreamName, message
# Making an HTML table
# https://jamesdatatechq.wordpress.com/2014/12/23/how-to-create-an-html-table-from-powershell/
$Style = "
<style>
TABLE{border-width: 1px;border-style: solid;border-color: black;border-collapse: collapse;}
TH{border-width: 1px;padding: 3px;border-style: solid;border-color: black;}
TD{border-width: 1px;padding: 3px;border-style: solid;border-color: black;}
</style>
"
$error_table = $final_log| ConvertTo-HTML -AS Table -Fragment -PreContent "<h2>Cloudwatch Alerts as of $now</h2>"|Out-String
#Save the HTML Web Page as a string so we can email it
[string]$email_table = (ConvertTo-HTML -head $Style -PostContent $error_table)
# Now send the mail
if ($final_log){
#Send email to $myemail. SMTP is a CNAME to the corporate SMTP server we use.
Send-MailMessage -smtpserver smtp.mycompany.com `
-To $myemail `
-From "Do_Not_Reply_PoSH <dev@null.com>" `
-Subject "[AWS] Cloudwatch errors in Prod as of $now" `
-body $email_table -BodyAsHtml
}

[AWS] Aurora - reading the errorlog files with powershell

Been working on monitoring.  For some reason, when you tell Aurora to send errorlogs to Cloudwatch, all it sends are the Audit Logs, which will tell you that code had changed, etc, but doesn't (!?!?!??!!!) actually put your logs in Cloudwatch.  I don't understand it, so I built this process to look through logs and return the data.  The next step would be to format it and either upload to Cloudwatch manually, or log it, or send email.  Whatever works for you.  You be you.  :D


###################################################
## Reading error logs for Aurora with Powershell ##
###################################################
# 1.01 - MDB 2018/05/30 Better trapping of time; there's a lot of rows that have no datetime that I want. Also added HTML email.
# 1.02 - mdb 2018/05/30 also dealing better with the regex filter. Recommend you convert it to a NOTMATCH; too many different error types
clear
Remove-Variable -Name * -ErrorAction SilentlyContinue
$myemail="dev@null.com"
# Note that this one does NOT read the audit log, which means we can filter the log based on a normal timestamp.
# But it does filter messages added after the time given.
import-module awspowershell
$credentials = get-awscredentials -ProfileName mysavedprofilename
Set-DefaultAWSRegion -Region us-west-1
$now = (get-date).ToString('yyyy/MM/dd HH:mm:ss')
$final_log = @() #setting the array to prevent type errors when adding to the array
# Figure out what files we need to look at, since they have a "last written to" field. Despite the FLOOR, the Past_Unix_Timestamp is X hours ago, down to the second, from when it is run.
$hours_to_look_back = 1.25
$Past_Unix_Timestamp = [Math]::Floor([decimal](Get-Date((Get-Date).AddHours(-1*$hours_to_look_back)).ToUniversalTime()-uformat "%s")) * 1000
$Past_Timestamp = (Get-Date((Get-Date).AddHours(-1*$hours_to_look_back)).ToUniversalTime())
#See far below for more on filters; be careful on what you filter. I'd probably use a NotMatch to ignore certain things
$Log_Filters = "[ERROR]","failed","Note","[Warning]"
#We use this line to take our list of log filters and generate a regex that is then compared against
# https://blogs.technet.microsoft.com/heyscriptingguy/2011/02/18/speed-up-array-comparisons-in-powershell-with-a-runtime-regex/
# [regex] $a_regex = ‘(?i)^(‘ + (($a |foreach {[regex]::escape($_)}) –join “|”) + ‘)$’
# the replace is needed because I want an exact match of those commas, and the escape($_) does weird stuff to it
# 2018/05/09 removed [regex] at beginning as per https://stackoverflow.com/questions/50261383/powershell-3-regex-why-is-case-insensitive-not-working#50261506
$Log_Filter_regex = (($Log_Filters|foreach {[regex]::escape($_) -replace ",","\,"}) –join "|")
#removed this from the WHERE line below, though useful depending on what you're looking for: -and ($_.LogFileName -notlike '*audit*')
get-rdsdbinstance -Credential $credentials | `
where {$_.DBClusterIdentifier -like '*'}|% {
$instancename = $_.DBInstanceIdentifier
$instancename
get-rdsdblogfile -dbinstanceidentifier $instancename -Credential $credentials | `
where {($_.LastWritten -gt $Past_Unix_Timestamp) -and ($_.LogFileName -notlike '*slow*') `
-and ($_.LogFileName -notlike 'audit/*')
} |
% {
$_.LogFileName
$LogFileName = $_.LogFileName
$Marker = 0
DO
{
$AWSLogFile = Get-RDSDBLogFilePortion -DBInstanceIdentifier $instancename -LogFileName $_.LogFileName `
-Credential $credentials -Marker $Marker -NumberOfLines 2000
#$AWSLogFile
if ($AWSLogFile.LogFileData)
{
$parsed_log = $AWSLogFile.LogFileData.split("`n`r") #splitting it to multiple lines using the CRLF in the file
#Here, I try and parse the first 19 characters, which in the normal log, should be a timedatestamp.
# If it's either parseable and recent, or unparseable, I then filter and toss into a master array of data.
#https://social.technet.microsoft.com/Forums/ie/en-US/0a7cbba0-87fb-450d-b8e8-29aed0947fd7/cant-get-datetimetryparseexact-to-work-using-powershell?forum=winserverpowershell
#https://stackoverflow.com/questions/2988880/extricate-a-substring-using-powershell
#https://blogs.technet.microsoft.com/heyscriptingguy/2014/12/19/powertip-convert-string-into-datetime-object/
#parsedDat is what we get back from TryParseExact - either a valid time or a big nothing.
$parsedDat=0
$dateformat = 'yyyy-MM-dd HH:mm:ss'
$final_log += foreach($line in $parsed_log){
#reset this each time so we don't get bad data
$attemptedtime = $null
#The actual substring, but unlike substring it won't throw an error if the line is empty or too short.
[string]$attemptedtime = $line[0..18] -join ''
#Trying to convert it to an actual date.
$linetime = if([DateTime]::TryParseExact($attemptedtime, $dateformat, $null,'None',[ref]$parsedDat)){
$parseddat
}else{
get-date #this way, anything that can't be parsed will be within our window
}
#Now we get recent lines, run it through the REGEX filter, and save it. Use {1 -eq 1} to return all.
# There are a LOT of messages that don't have the right tags, so you might leave it off.
if ($linetime -gt $Past_Timestamp)
{
$line | where {$_ -imatch $Log_Filter_regex}| `
select @{LABEL="Instance"; EXPRESSION = {$instancename}}, `
@{LABEL="LogFileName"; EXPRESSION = {$LogFileName}}, `
@{LABEL="Message"; EXPRESSION = {$_}}
}
}
}
$Marker = $AWSLogFile.Marker
}WHILE ($AWSLogFile.AdditionalDataPending -eq $true)
}
}
$final_log #you still need to get it into a format you can send; I'm using an HTML table
#echoing it out here so that the SQLAgent log has a record of what happened, if everything worked.
# Making an HTML table
# https://jamesdatatechq.wordpress.com/2014/12/23/how-to-create-an-html-table-from-powershell/
$Style = "
<style>
TABLE{border-width: 1px;border-style: solid;border-color: black;border-collapse: collapse;}
TH{border-width: 1px;padding: 3px;border-style: solid;border-color: black;}
TD{border-width: 1px;padding: 3px;border-style: solid;border-color: black;}
</style>
"
$error_table = $final_log| ConvertTo-HTML -AS Table -Fragment -PreContent "<h2>ErrorLog messages as of $now, starting at $Past_TimeStamp </h2>"|Out-String
#Save the HTML Web Page
[string]$email_table = (ConvertTo-HTML -head $Style -PostContent $error_table)
# Now send the mail
if ($final_log){
#Send email to $myemail. SMTP is a CNAME to the corporate SMTP server we use.
Send-MailMessage -smtpserver smtp.mycompany.com `
-To $myemail `
-From "Do_Not_Reply_PoSH <do_not_reply@dev.null>" `
-Subject "[AWS] RDSDB Errors in Prod as of $now" `
-body $string_log
"mail sent"
}

Thursday, May 3, 2018

[AWS] powershell to patch all Aurora clusters

Pretty basic, but took longer than I figured it would.  The catch was figuring out how to look inside the results.


set-awscredentials -accesskey youraccesskey -secretkey yoursecretkey

Get-RDSPendingMaintenanceAction|%{
Submit-RDSPendingMaintenanceAction -ResourceIdentifier $_.ResourceIdentifier -applyaction $_.PendingMaintenanceActionDetails.action -OptInType immediate }


So when you get the results back, it looks like:

PendingMaintenanceActionDetails               ResourceIdentifier                                         
-------------------------------               ------------------                                         
{Amazon.RDS.Model.PendingMaintenanceAction}   arn:aws:rds:us-west-1:xxxxxxxx:cluster:xxxxxx

How do you view what's in that Amazon.RDS object?  I have no doubt there's some way to unpack it with powershell, but I'm not sure what that is.

What I did:

Looked at the PoSH module documentation for this cmdlet (Get-RDSPendingMaintenanceAction) to see what it returned:
https://docs.aws.amazon.com/powershell/latest/reference/Index.html

Which says:
PendingMaintenanceActionDetails
ResourceIdentifier

Which, indeed, is what it returned to us.

Now, clicking on the object info from the documentation:
System.Collections.Generic.List<Amazon.RDS.Model.PendingMaintenanceAction>

takes us to:
https://docs.aws.amazon.com/sdkfornet/v3/apidocs/index.html?page=RDS/TRDSResourcePendingMaintenanceActions.html&tocid=Amazon_RDS_Model_ResourcePendingMaintenanceActions)

And THAT page says it has Action, AutoAppliedAfterDate, etc.
When I run

$DataSet = Get-RDSPendingMaintenanceAction
$DataSet.PendingMaintenanceActionDetails

Here's what I get:

Action               : system-update
AutoAppliedAfterDate : 1/1/0001 12:00:00 AM
CurrentApplyDate     : 5/2/2018 4:41:00 PM
Description          : Aurora 1.17.2 release
ForcedApplyDate      : 1/1/0001 12:00:00 AM
OptInStatus          : immediate


So, now we have the fields we need: what kind of action to take (non optional, and it can be db-update or system-update), and the ResourceIdentifier for the cluster.