Wednesday, June 9, 2021

SQL Workbench/J & Aurora & MariaDB driver getting "could not load system variables" - FIXED!

Using the 2.7.3 MariaDB Java connector with SQL Workbench/J, I was trying to reach my Aurora instance but kept getting this error:

could not load system variables

These all looked related, but none had the fix:
  • https://community.dremio.com/t/mysql-source-creation-fails/6738/2
  • https://github.com/sysown/proxysql/issues/2523
  • https://github.com/sysown/proxysql/issues/2009
  • https://stackoverflow.com/questions/63283846/issue-integrating-mariadb-client-with-mysql-db
Finally found an answer here:
  • https://jira.mariadb.org/browse/CONJ-824
As per Diego Dupin:

The Aurora proxy has a known race condition issue that results in skipping other queries in the proxy buffer. The connection might hang waiting for a query response. During authentication, the socket has a timeout, which will result in throwing the error you describe.

So if `usePipelineAuth` or `useBatchMultiSend` is enabled, it might work ... or not. All pipeline options (`usePipelineAuth` and `useBatchMultiSend`) must be disabled when using Aurora. The problem has been reported to Aurora a few times without correction.


What worked for us was adding this after the DB name in the JDBC URL:

usePipelineAuth=false&useBatchMultiSend=false

like this:

jdbc:mysql://127.0.0.1:3307/mydbname?usePipelineAuth=false&useBatchMultiSend=false

Friday, June 4, 2021

Powershell - using ODBC and the Simba Driver to query AWS Athena

I was running into issues with the Linked Server lopping off long JSON that I'm having to pull out of the raw files. I can't explain it; it doesn't appear to be SSMS. See the previous post.


But I needed to automate this, rather than use SQL Workbench, save to "Excel" (it was actually XML), then open it again and re-save it so that instead of 250 MB it's 30 MB. The script runs against the previous month, one day at a time (walking the partitions), and then saves to a file. You got your Athena, your ODBC, your Export-Excel...



#This has two requirements:
# 1 - Simba installed as an ODBC driver, with a DSN called "your_simba_athena_dsn_name_here". Free from Amazon for this use.
# 2 - ImportExcel. Requires Powershell V4 (V5?), but for sure V3 won't work.
#Get-ODBC-Data2 is just a slight modification of Anders' code that includes the UID/PWD. Thanks!
#https://www.andersrodland.com/working-with-odbc-connections-in-powershell/
#modified because we have to pass in user/password to Athena.
function Get-ODBC-Data2 {
    param(
        [string]$dsn,
        [string]$param,
        [string]$query = $(throw 'query is required.')
    )
    $conn = New-Object System.Data.Odbc.OdbcConnection
    $conn.ConnectionString = "DSN=$dsn;$param"
    $conn.Open()
    $cmd = New-Object System.Data.Odbc.OdbcCommand($query, $conn)
    $ds = New-Object System.Data.DataSet
    (New-Object System.Data.Odbc.OdbcDataAdapter($cmd)).Fill($ds) | Out-Null
    $conn.Close()
    $ds.Tables[0]
}

#Set start and end. Those can be whatever, but I made it one month for simplicity.
[datetime]$start_date_yyyymmdd = (Get-Date).AddMonths(-1).ToString("yyyy-MM-01")
[datetime]$end_date_yyyymmdd = $start_date_yyyymmdd.AddMonths(1).ToString("yyyy-MM-01")
#[datetime]$end_date_yyyymmdd = (Get-Date).AddMonths(0).ToString("yyyy-MM-01")
$filename = "Myfile_" + $start_date_yyyymmdd.ToString("yyyyMM")

#Set a variable so that as we walk through, we have an array to save everything to.
$formattedresults = @()

#Loop through the days.
while ($start_date_yyyymmdd -lt $end_date_yyyymmdd) {
    $startdate = $start_date_yyyymmdd.ToString('yyyy-MM-dd')
    "starting $startdate"
    $query = @"
select json_extract_scalar(myfield, '$.fieldjsonparsed') as fielda, dt,
fieldb, fieldc
from myschema.mytablename
where dt = '$startdate'
"@

    $results = Get-ODBC-Data2 -query $query -dsn "your_simba_athena_dsn_name_here" -param "uid=youraccesskeyhere;PWD=yoursecretkeyhere"
    #Save a non-System.Data.DataRow version to our main array.
    $formattedresults += $results | Select-Object fielda, fieldb, fieldc
    $start_date_yyyymmdd = $start_date_yyyymmdd.AddDays(1)
}

#Export using the ImportExcel module, which unfortunately needs at least higher than V3.
$formattedresults | Export-Excel -AutoSize -BoldTopRow -FreezeTopRow -TableName Test -TableStyle "Light13" -WorkSheetname Test -Path c:\temp\$filename.xlsx


EXECUTE AT against Athena via Linked Server and "does not support the required transaction interface"

TLDR - using EXECUTE AT with Athena via a linked server. I used https://www.mssqltips.com/sqlservertip/6191/query-aws-athena-data-from-sql-server to get it set up, and it works! But there are some issues.



1) Partition your tables in Athena. You get much better performance if you have a decent amount of data.

2) This means you can't just do "select from <linkedserver>...tablename", as that doesn't tell Athena to do partition elimination.

3) Once you're using OPENQUERY, if you run into issues because SQL Server doesn't know what metadata to expect, you can use CAST or CONVERT to force it (e.g. use "cast(fielda as varchar(500))" if it thinks the field is only 255 characters).

4) That doesn't help with anything over varchar(8000). You need to move to EXECUTE AT instead, but that seems better than OPENQUERY, as it appears to just pass the data back - I got rid of my CASTs in the Athena query. Plus, dynamic SQL is easier! You will have to enable RPC OUT on the Linked Server.

5) I couldn't do INSERT INTO ... EXECUTE AT until I disabled Distributed Transactions on that Linked Server.
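
Here's a rough sketch of points 2 through 5 as T-SQL - the linked server name (ATHENA_LINKED), the staging table, and the column names are all made up for illustration, so swap in your own:

```sql
-- OPENQUERY: the query is passed through, but SQL Server guesses the metadata.
-- Force it with a CAST inside the Athena query; this caps out at varchar(8000).
SELECT *
FROM OPENQUERY(ATHENA_LINKED, '
    select cast(fielda as varchar(500)) as fielda, fieldb
    from myschema.mytablename
    where dt = ''2021-05-01''
');

-- EXECUTE AT: pure pass-through, no CASTs needed, and dynamic SQL is easy.
-- Requires RPC OUT enabled on the linked server, and INSERT INTO ... EXECUTE AT
-- needed Distributed Transactions disabled for me.
DECLARE @dt varchar(10) = '2021-05-01';
DECLARE @sql nvarchar(max) =
    'select fielda, fieldb from myschema.mytablename where dt = ''' + @dt + '''';

INSERT INTO dbo.my_staging_table (fielda, fieldb)
EXECUTE (@sql) AT ATHENA_LINKED;
```

Note how the date predicate ends up inside the query sent to Athena in both cases - that's what lets Athena do the partition elimination from point 2.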



Update 2021/06/04 - and there's something screwy with EXECUTE AT. It runs without the CAST statements and doesn't complain, but it doesn't appear to save everything - I have a handful of rows (out of 100k, mind you) that don't match up - one of the fields is lopped off. It's the most annoying thing, since my other option is SQL Workbench, which WORKS but isn't easily automated. After futzing with it for a bit, I wound up moving over to a pure Powershell solution - I'll post that in a few.