AWS RDS Postgres password-less authentication in Canton

Thank you, I’ll try this and get back.

@Ratko_Veprek - Sorry for the delay, we finally got around to testing this, but it needed a few more changes to get working than what was suggested above, in particular in the file below:

+++ b/community/ledger/ledger-api-core/src/main/scala/com/digitalasset/canton/platform/store/backend/postgresql/PostgresDataSourceStorageBackend.scala
@@ -6,7 +6,7 @@ package com.digitalasset.canton.platform.store.backend.postgresql
 import anorm.SqlParser.get
 import anorm.SqlStringInterpolation
 import com.daml.resources.ProgramResource.StartupException
-import com.digitalasset.canton.logging.{NamedLoggerFactory, NamedLogging}
+import com.digitalasset.canton.logging.{NamedLoggerFactory, NamedLogging, TracedLogger}
 import com.digitalasset.canton.platform.store.backend.DataSourceStorageBackend
 import com.digitalasset.canton.platform.store.backend.common.{
   DataSourceStorageBackendImpl,
@@ -14,7 +14,7 @@ import com.digitalasset.canton.platform.store.backend.common.{
 }
 import com.digitalasset.canton.platform.store.backend.postgresql.PostgresDataSourceConfig.SynchronousCommitValue
 import com.digitalasset.canton.tracing.TraceContext
-import org.postgresql.ds.PGSimpleDataSource
+import com.zaxxer.hikari.HikariDataSource
 
 import java.sql.Connection
 import javax.sql.DataSource
@@ -25,6 +25,7 @@ final case class PostgresDataSourceConfig(
     tcpKeepalivesIdle: Option[Int] = Some(10), // corresponds to: tcp_keepalives_idle
     tcpKeepalivesInterval: Option[Int] = Some(1), // corresponds to: tcp_keepalives_interval
     tcpKeepalivesCount: Option[Int] = Some(5), // corresponds to: tcp_keepalives_count
+    driverClassName: Option[String] = None,
 )
 
 object PostgresDataSourceConfig {
@@ -57,8 +58,15 @@ class PostgresDataSourceStorageBackend(
       connectionInitHook: Option[Connection => Unit],
   ): DataSource = {
     import DataSourceStorageBackendImpl.exe
-    val pgSimpleDataSource = new PGSimpleDataSource()
-    pgSimpleDataSource.setUrl(dataSourceConfig.jdbcUrl)
+    implicit val traceContext: TraceContext = TraceContext.empty
+    val logger = TracedLogger(loggerFactory.getLogger(getClass))
+    val hikariDataSource = new HikariDataSource()
+    hikariDataSource.setJdbcUrl(dataSourceConfig.jdbcUrl)
+    
+    dataSourceConfig.postgresConfig.driverClassName.foreach(i => {
+        logger.info(s"Using driver class name: $i")
+        hikariDataSource.setDriverClassName(i)
+      })
 
     val hookFunctions = List(
       dataSourceConfig.postgresConfig.synchronousCommit.toList
@@ -71,7 +79,7 @@ class PostgresDataSourceStorageBackend(
         .map(i => exe(s"SET tcp_keepalives_count TO $i")),
       connectionInitHook.toList,
     ).flatten
-    InitHookDataSourceProxy(pgSimpleDataSource, hookFunctions, loggerFactory)
+    InitHookDataSourceProxy(hikariDataSource, hookFunctions, loggerFactory)
   }

Summarizing the changes:

  1. Use of HikariDataSource instead of PGSimpleDataSource, as PGSimpleDataSource doesn’t support the use of the AWS JDBC driver.
  2. Addition of an optional property ‘driverClassName’ to the existing Ledger API PostgresDataSourceConfig. In this case it can be used to specify ‘software.amazon.jdbc.Driver’.
  3. Set ledger-api-jdbc-url manually to use the “jdbc:aws-wrapper:” URL format.
  4. Set the jdbcUrl and the driverClassName on the HikariDataSource (see the sketch after this list).
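For illustration, items 2–4 fit together roughly like this (a minimal sketch in Scala; the endpoint, database, and user names are hypothetical, and wrapperPlugins=iam is the wrapper parameter that enables IAM authentication):

import com.zaxxer.hikari.HikariDataSource

object AwsIamHikariExample {
  def build(): HikariDataSource = {
    val ds = new HikariDataSource()
    // The aws-wrapper prefix routes connections through the AWS Advanced JDBC Wrapper;
    // wrapperPlugins=iam enables password-less IAM authentication.
    ds.setJdbcUrl(
      "jdbc:aws-wrapper:postgresql://mydb.cluster-abc.eu-central-1.rds.amazonaws.com:5432/canton?wrapperPlugins=iam"
    )
    // Matches the new optional PostgresDataSourceConfig.driverClassName property.
    ds.setDriverClassName("software.amazon.jdbc.Driver")
    ds.setUsername("iam_db_user") // IAM-mapped database user; no password needed
    ds
  }
}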

Basic testing seems to be fine; full-fledged testing is in progress, and I’ll update you further.

Let me know what you think about these changes; happy to open a PR if everything goes well.

Any updates on this?

Ah sorry. Missed your update. Let me check it quickly.

It seems to cause some issues with lock allocation in the HA coordinator. Did it work for you?

Haven’t seen any errors so far. Do you see any errors in the logs? Please post more details and I can check.

Yes, so I’ve looked at some of the tests. Effectively there is a high-level and a fundamental problem with the change. The high-level one is likely that the pool never gets closed and therefore doesn’t release the database lock, which breaks HA failover.

I then checked with the author of that part and his response was:

Unless we want to have a Hikari pool backed by another Hikari pool, I would not do it. The purpose of DataSourceStorageBackend.createDataSource is to create the pristine/simple and, most importantly, NOT POOLED data source, which will be used appropriately later, for example as an input to a Hikari pool.

So it seems to me that this is a bit more invasive, as you need to load a specific driver for AWS RDS. So instead of returning a HikariDataSource, you will likely need to return an AwsWrapperDataSource.
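A rough sketch of what that could look like, based on the linked DataSource docs (the property names vary between wrapper versions, so treat this as an assumption rather than a drop-in replacement):

import java.util.Properties
import javax.sql.DataSource
import software.amazon.jdbc.ds.AwsWrapperDataSource

object AwsWrapperDataSourceExample {
  // Unpooled data source backed by the AWS wrapper, delegating to PGSimpleDataSource.
  def create(serverName: String, database: String, user: String): DataSource = {
    val ds = new AwsWrapperDataSource()
    ds.setJdbcProtocol("jdbc:postgresql:")
    // Delegate to the plain, non-pooled Postgres data source, keeping the
    // createDataSource contract described above.
    ds.setTargetDataSourceClassName("org.postgresql.ds.PGSimpleDataSource")
    val props = new Properties()
    props.setProperty("serverName", serverName)
    props.setProperty("database", database)
    props.setProperty("user", user)
    props.setProperty("wrapperPlugins", "iam") // password-less IAM auth
    ds.setTargetDataSourceProperties(props)
    ds
  }
}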

Depending on the variation of the configuration, this could either be done within the PG storage backend, or with an explicit AwsRDSDataSourceStorageBackend: aws-advanced-jdbc-wrapper/docs/using-the-jdbc-driver/DataSource.md at f5b9dd63a894c21d5319513856ff3581d9747ddc · aws/aws-advanced-jdbc-wrapper · GitHub

Ideally, we’d load the AWS data source using reflection so we don’t need to link the JAR at compile time.
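A minimal sketch of that reflective loading (the class name is taken from the wrapper docs; error handling elided):

import javax.sql.DataSource

object ReflectiveAwsDataSourceLoader {
  // Instantiate the AWS wrapper data source without a compile-time dependency on the JAR.
  // If the wrapper is not on the classpath, Class.forName throws ClassNotFoundException
  // and the caller can fall back to PGSimpleDataSource.
  def load(): DataSource =
    Class
      .forName("software.amazon.jdbc.ds.AwsWrapperDataSource")
      .getDeclaredConstructor()
      .newInstance()
      .asInstanceOf[DataSource]
}

Configuring the instance would likewise have to go through reflective setter calls, since the wrapper types can’t be referenced at compile time.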

Actually, I asked myself why it worked for you at all, as Canton has two storage backends (for historical reasons; we are working on getting rid of one). The other one seems to automatically figure out which data source to use: canton/community/base/src/main/scala/com/digitalasset/canton/resource/Storage.scala at b5183318993b0201676627ad78ce85c88e9e64b4 · digital-asset/canton · GitHub
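To illustrate what I mean by figuring it out automatically, the dispatch in that backend amounts to roughly this kind of branching on the JDBC URL prefix (simplified and illustrative, not the actual Storage.scala code):

import javax.sql.DataSource

object UrlBasedDataSourceSelection {
  // Illustrative only: pick a data source implementation from the JDBC URL prefix.
  def dataSourceFor(jdbcUrl: String): DataSource =
    if (jdbcUrl.startsWith("jdbc:postgresql:")) {
      val ds = new org.postgresql.ds.PGSimpleDataSource()
      ds.setUrl(jdbcUrl)
      ds
    } else if (jdbcUrl.startsWith("jdbc:h2:")) {
      val ds = new org.h2.jdbcx.JdbcDataSource()
      ds.setURL(jdbcUrl)
      ds
    } else {
      // A "jdbc:aws-wrapper:" URL would need its own branch here.
      throw new IllegalArgumentException(s"Unsupported JDBC URL: $jdbcUrl")
    }
}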

Yeah, so this is a bit more involved.

Thanks for the insights; that makes sense. Let me try something along these lines and get back.