
The Issue: I am unable to observe the asynchronous capabilities of Log4J2 after initializing SparkContext in Spark local mode.

Log4j2 dependencies in SBT:

  "com.lmax" % "disruptor" % "3.3.5",
  "org.apache.logging.log4j" % "log4j-api" % "2.8.2",
  "org.apache.logging.log4j" % "log4j-core" % "2.8.2",
  "org.apache.logging.log4j" %% "log4j-api-scala" % "2.8.2"

Log4j2 config file:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="debug">
  <Appenders>
    <Console name="Console-Appender" target="SYSTEM_OUT">
      <PatternLayout>
        <pattern>
          [%-5level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n
        </pattern>
      </PatternLayout>
    </Console>
    <File name="File-Appender" fileName="logs/xmlfilelog.log">
      <PatternLayout>
        <pattern>
          [%-5level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n
        </pattern>
      </PatternLayout>
    </File>
  </Appenders>
  <Loggers>
    <Logger name="guru.springframework.blog.log4j2async" level="debug">
      <AppenderRef ref="File-Appender"/>
    </Logger>
    <Root level="debug">
      <AppenderRef ref="Console-Appender"/>
    </Root>
  </Loggers>
</Configuration>
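(For reference: instead of making every logger async via the context selector, Log4j2 also supports mixing sync and async loggers directly in the configuration with AsyncRoot/AsyncLogger elements, in which case no Log4jContextSelector property is needed. A minimal sketch of that alternative, not the configuration I am actually using:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn">
  <Appenders>
    <Console name="Console-Appender" target="SYSTEM_OUT">
      <PatternLayout pattern="[%-5level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %c{1} - %msg%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <!-- Only loggers declared as AsyncRoot/AsyncLogger go through the Disruptor. -->
    <AsyncRoot level="debug">
      <AppenderRef ref="Console-Appender"/>
    </AsyncRoot>
  </Loggers>
</Configuration>)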

I set the following system property in the IntelliJ run configuration:

-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
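For completeness, the same property can also be set programmatically, as long as it happens before the first logger is obtained (a sketch; Main and the logger name are placeholders):

object Main {
  def main(args: Array[String]): Unit = {
    // Must run before any LogManager/Logger call; otherwise the default
    // synchronous context selector has already been chosen.
    System.setProperty(
      "Log4jContextSelector",
      "org.apache.logging.log4j.core.async.AsyncLoggerContextSelector")

    val logger = org.apache.logging.log4j.LogManager.getLogger("test")
    logger.info("async selector configured")
  }
}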

To test asynchronous behavior, I ran the following code snippet before and after initializing SparkContext:

val iterations = 1000
val start = System.nanoTime()

for (i <- 1 to iterations) {
  logger.error("Hello")
}

val end = System.nanoTime()
val timeMS = (end - start) / 1000000
println(s"Parsed ${iterations} reports in ${timeMS} ms ${timeMS/1000} sec")

Successful result: I was able to see the following debug line, confirming that the async context was enabled: 2017-04-25 14:55:40,541 main DEBUG LoggerContext[name=AsyncContext@6d9c638, org.apache.logging.log4j.core.async.AsyncLoggerContext@758f4f03] started OK. Additionally, my println statement ("Parsed...") appeared somewhere in the middle of the logger output, indicating async behavior.

However, once I moved this code block after the SparkContext initialization, I no longer observe async behavior: the println statement ("Parsed...") always appears after the logger output. What is confusing is that the line 2017-04-25 14:55:40,541 main DEBUG LoggerContext[name=AsyncContext@6d9c638, org.apache.logging.log4j.core.async.AsyncLoggerContext@758f4f03] started OK. still appears.

For a sanity check, I also printed the package of the logger API inside the scope of the Spark application. The result yielded package org.apache.logging.log4j.scala, Apache Log4j Scala 2.11 wrapper for Log4j API, version 2.8.2. According to the Log4j2 docs, "Log4j 2 Scala API is dependent on the Log4j 2 API." Thus, I believe I can deduce that I am actually using Log4j2.
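A more direct runtime check (my own sketch; AsyncLoggerContextSelector.isSelected has been available in log4j-core since 2.6, so it should exist in 2.8.2) would be to ask Log4j2 which context is active, before and after SparkContext initialization:

import org.apache.logging.log4j.LogManager
import org.apache.logging.log4j.core.async.AsyncLoggerContextSelector

// Reports which selector/context is active at this point in the program.
println(s"Async selector active: ${AsyncLoggerContextSelector.isSelected}")
println(s"Logger context class: ${LogManager.getContext(false).getClass.getName}")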

MAIN QUESTION: Is Spark somehow intercepting my Log4j2 configuration and defaulting to its own Log4j configuration, or does it simply not allow async logging behavior?

  • It's an old post, so I just wanted to mention that I tested this locally with Spark 2.4.5 and log4j 2.17.0 and it didn't reproduce: async seems to work properly. Commented Dec 23, 2021 at 18:59

1 Answer


The only thing you need to make sure of is that the system property is set before Log4j2 is initialized:

-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector

It seems to me that this is done correctly: if you can see that an AsyncLoggerContext is in use, then you can be sure you are using Disruptor-backed async loggers.
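One more thing worth checking (my suggestion, not something the post confirms as the cause): Spark itself depends on log4j 1.x and the slf4j-log4j12 binding, which can end up routing logging around your Log4j2 setup. A sketch of excluding them in SBT and bridging everything through Log4j 2 instead (the Spark version here is an assumption):

libraryDependencies ++= Seq(
  ("org.apache.spark" %% "spark-core" % "2.1.0")  // assumed Spark version
    .exclude("log4j", "log4j")
    .exclude("org.slf4j", "slf4j-log4j12"),
  "org.apache.logging.log4j" % "log4j-1.2-api" % "2.8.2",   // bridges log4j 1.x calls into log4j2
  "org.apache.logging.log4j" % "log4j-slf4j-impl" % "2.8.2" // routes slf4j into log4j2
)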


4 Comments

Yes, it appears AsyncLoggerContext is used even in the executors. I see the lines 2017-04-26 10:11:15,660 Executor task launch worker-7 DEBUG LoggerContext[name=DefaultAsyncContext@Executor task launch worker-6, org.apache.logging.log4j.core.async.AsyncLoggerContext@56465954] started OK. and 10:11:15.662 [Executor task launch worker-7] ERROR LoggingTestWithRDD - Hello. However, async behavior is not occurring. Perhaps it has something to do with Spark local mode.
The problem is likely in how you are trying to verify async behavior. Try a simpler print statement println("This is System.out.println") (without string interpolation), and increase the number of iterations in the for loop to 10K or 50K.
That is a bit odd indeed. Have you tried asking on the Spark mailing list?
Just posted to the mailing list
