Discussion:
log4net rolling file appender losing log files during roll over
Douglas Neary
2014-01-14 22:36:58 UTC
Permalink
All:

Under heavy strain, I am seeing log4net lose log files during roll over. I filed a bug yesterday (https://issues.apache.org/jira/browse/LOG4NET-416), and I wanted to offer my help (and hardware) to help track it down. I attached the internal log4net log to the ticket.

The hardware we are using is pretty high end (http://www8.hp.com/us/en/products/proliant-servers/product-detail.html?oid=4231377#!tab=features). We are using this beast as a combined image processor and web server, running IIS 7.5 on Windows Server 2008 R2.

For every transaction processed, a line is written to log4net. On average, we are probably doing around 7 transactions per second. This works fine for a while, until it doesn't. Then it starts losing log files during rollover.

Given all the value we have gotten out of log4net, my company is willing to donate some of my time and some time on that DL980 to test out the fixes...

Is there anyone who can give me a push in the right direction?

Thanks,
Doug
Dominik Psenner
2014-01-15 07:47:05 UTC
Permalink
Good morning,



I've looked at the logfile [2] and all starts with the .log file not being
rolled cause the file is still open and thus can't be moved. The following
warnings are caused by the first error. So this is where the trouble starts:



log4net: Moving [c:\logs\mobile_deposit_web.log.5] ->
[c:\logs\mobile_deposit_web.log.6]

log4net: Moving [c:\logs\mobile_deposit_web.log.4] ->
[c:\logs\mobile_deposit_web.log.5]

log4net: Moving [c:\logs\mobile_deposit_web.log.3] ->
[c:\logs\mobile_deposit_web.log.4]

log4net: Moving [c:\logs\mobile_deposit_web.log.2] ->
[c:\logs\mobile_deposit_web.log.3]

log4net: Moving [c:\logs\mobile_deposit_web.log.1] ->
[c:\logs\mobile_deposit_web.log.2]

log4net: Moving [c:\logs\mobile_deposit_web.log] ->
[c:\logs\mobile_deposit_web.log.1]

log4net:ERROR [RollingFileAppender] ErrorCode: GenericFailure. Exception
while rolling file [c:\logs\mobile_deposit_web.log] ->
[c:\logs\mobile_deposit_web.log.1]

System.IO.IOException: The process cannot access the file because it is
being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.__Error.WinIOError()

at System.IO.File.Move(String sourceFileName, String destFileName)

at log4net.Appender.RollingFileAppender.RollFile(String fromFile, String
toFile)

log4net:ERROR RollingFileAppender: INTERNAL ERROR. Append is False but
OutputFile [c:\logs\mobile_deposit_web.log] already exists.

log4net: Opening file for writing [c:\logs\mobile_deposit_web.log] append
[False]

log4net: rolling over count [104857887]

log4net: maxSizeRollBackups [100]

log4net: curSizeRollBackups [8]

log4net: countDirection [-1]



My first guess is that there are multiple processes logging into the same
file. That would mean that one process may holds a write lock on the file
while another one tries to roll the file. We have seen that people often are
not aware of the IIS app domain pool and its impact on the appender
configuration. If I'm right, then your usecase hints that there are probably
three ways to resolve the issue:



* Use the InterProcessLock on the rolling file appender

o this might produce a bottleneck for your high performance demands

* Use remote appenders that sink log events to a remote process that
appends the events to the file

o This might be a nice try

* Replace the rolling file appender with an appender that sinks
events into a database or another storage that handles the write locks

o Probably easier to configure than the remote appenders but the logs are
no longer stored as plain text in a file



Cheers,

D.



Von: Douglas Neary [mailto:***@miteksystems.com]
Gesendet: Dienstag, 14. Januar 2014 23:37
An: Log4NET User
Betreff: log4net rolling file appender losing log files during roll over



All:



Under heavy strain, I am seeing log4net lose log files during roll over. I
filed a bug yesterday (https://issues.apache.org/jira/browse/LOG4NET-416),
and I wanted to offer my help (and hardware) to help track it down. I
attached the internal log4net log to the ticket.



The hardware we are using is pretty high end
(http://www8.hp.com/us/en/products/proliant-servers/product-detail.html?oid=
4231377#!tab=features). We are using this beast as a combined image
processor and web server, running IIS 7.5 on Windows Server 2008 R2.



For every transaction processed, a line is written to log4net. On average,
we are probably doing around 7 transactions per second. This works fine for
a while, until it doesn't. Then it starts losing log files during
rollover.



Given all the value we have gotten out of log4net, my company is willing to
donate some of my time and some time on that DL980 to test out the fixes.



Is there anyone who can give me a push in the right direction?



Thanks,
Doug
Douglas Neary
2014-01-16 22:23:59 UTC
Permalink
Hi Dominik:

Thanks for the reply.
Agreed.
My first guess is that there are multiple processes logging into the same file.
I don't believe that is the case. Let me explain further...

This is an Asp.net application (.Net 3.5.1) with 3 separate web sites logging to 3 separate log files using the rolling file appender.
To be more specific, see the table below:

Folder

Log File Name

App Pool

c:\inetpub\wwwroot\mobiledeposit

C:\logs\mobile_deposit_web.log

DefaultAppPool

C:\inetpub\wwwroot\remotecarlar

C:\logs\mobile_deposit_remotecarlar.log

Remote_CarLar

C:\inetpub\wwwroot\remote_RE

C:\logs\mobile_deposit_remoteRE.log

Remote_RE


Since each of these 3 sites is running in a separate App Pool, they are each running in a separate (32 bit) w3wp.exe. There are 3 workers, each named w3wp.exe with different PIDs. . I added a screen shot from Process Explorer to the Jira ticket to show you what I mean.

App Domain Recycling has been disabled for each of these sites (I added another screen shot to Jira). The Jira url for this issue is https://issues.apache.org/jira/browse/LOG4NET-416.

So my questions:

1. Am I doing anything obviously wrong?

2. Assuming the answer to #1 is wrong, how can I help further?

Thanks,
Doug


p.s. I chopped off the rest of this email, please let me know if that was a bad idea.
Dominik Psenner
2014-01-17 08:23:09 UTC
Permalink
It would be worth checking if your assumption is true by logging the process
id and the thread name (pattern layouts %processid and %thread).



Cheers



Von: Douglas Neary [mailto:***@miteksystems.com]
Gesendet: Donnerstag, 16. Januar 2014 23:24
An: Log4NET User
Betreff: RE: log4net rolling file appender losing log files during roll over



Hi Dominik:



Thanks for the reply.
Post by Dominik Psenner
I've looked at the logfile [2] and all starts with the .log file not
being rolled cause the file is still open and thus can't be moved. The
following warnings are caused by the first error. So this is where the
trouble starts:



Agreed.
Post by Dominik Psenner
My first guess is that there are multiple processes logging into the same file.
I don't believe that is the case. Let me explain further.



This is an Asp.net application (.Net 3.5.1) with 3 separate web sites
logging to 3 separate log files using the rolling file appender.

To be more specific, see the table below:




Folder

Log File Name

App Pool


c:\inetpub\wwwroot\mobiledeposit

C:\logs\mobile_deposit_web.log

DefaultAppPool


C:\inetpub\wwwroot\remotecarlar

C:\logs\mobile_deposit_remotecarlar.log

Remote_CarLar


C:\inetpub\wwwroot\remote_RE

C:\logs\mobile_deposit_remoteRE.log

Remote_RE



Since each of these 3 sites is running in a separate App Pool, they are each
running in a separate (32 bit) w3wp.exe. There are 3 workers, each named
w3wp.exe with different PIDs. . I added a screen shot from Process
Explorer to the Jira ticket to show you what I mean.



App Domain Recycling has been disabled for each of these sites (I added
another screen shot to Jira). The Jira url for this issue is
https://issues.apache.org/jira/browse/LOG4NET-416.



So my questions:

1. Am I doing anything obviously wrong?

2. Assuming the answer to #1 is wrong, how can I help further?



Thanks,

Doug





p.s. I chopped off the rest of this email, please let me know if that was a
bad idea.
d_k
2014-01-18 00:08:41 UTC
Permalink
If i'm not mistaking the screenshot you attached shows that "Disable
Overlapped Recycle" is false which means it is on.
In addition to adding %processid as Dominik suggested you can consider
using that screen to configure IIS write an event log entry each time it
recycles so we could try and correlate that with the failing rolls.
Post by Dominik Psenner
It would be worth checking if your assumption is true by logging the
process id and the thread name (pattern layouts %processid and %thread).
Cheers
*Gesendet:* Donnerstag, 16. Januar 2014 23:24
*An:* Log4NET User
*Betreff:* RE: log4net rolling file appender losing log files during roll
over
Thanks for the reply.
I’ve looked at the logfile [2] and all starts with the .log file not
being rolled cause the file is still open and thus can’t be moved. The
following warnings are caused by the first error. So this is where the
Agreed.
My first guess is that there are multiple processes logging into the
same file.
I don’t believe that is the case. Let me explain further…
This is an Asp.net application (.Net 3.5.1) with 3 separate web sites
logging to 3 separate log files using the rolling file appender.
Folder
Log File Name
App Pool
c:\inetpub\wwwroot\mobiledeposit
C:\logs\mobile_deposit_web.log
DefaultAppPool
C:\inetpub\wwwroot\remotecarlar
C:\logs\mobile_deposit_remotecarlar.log
Remote_CarLar
C:\inetpub\wwwroot\remote_RE
C:\logs\mobile_deposit_remoteRE.log
Remote_RE
Since each of these 3 sites is running in a separate App Pool, they are
each running in a separate (32 bit) w3wp.exe. There are 3 workers, each
named w3wp.exe with different PIDs. . I added a screen shot from Process
Explorer to the Jira ticket to show you what I mean.
App Domain Recycling has been disabled for each of these sites (I added
another screen shot to Jira). The Jira url for this issue is
https://issues.apache.org/jira/browse/LOG4NET-416.
1. Am I doing anything obviously wrong?
2. Assuming the answer to #1 is wrong, how can I help further?
Thanks,
Doug
p.s. I chopped off the rest of this email, please let me know if that was a bad idea.
Svante Seleborg
2014-01-15 07:58:48 UTC
Permalink
Hello,

There is a known issue with log4net when you have mulitple sites logging to
the same file causing this exact situation (among other issues). Please
ensure this is not the case in your configuration. Just guessing, but I
might also guess that you'd risk seeing the same problem under heavy load
of a single site. Both of your missing files appear to happen due to
problems accessing the main log file when it is about to be rolled over for
the first time. This then propagates for each roll over, causing them to
end up as #7 and #26 missing in the snapshot you provided.

If your problem is caused by multiple sites, you may want to try
http://log4netremotelogging.codeplex.com/ which I wrote a few years ago to
handle this and other issues with consolidating logs from several
sites/servers.

Svante
Post by Douglas Neary
Under heavy strain, I am seeing log4net lose log files during roll over.
I filed a bug yesterday (https://issues.apache.org/jira/browse/LOG4NET-416),
and I wanted to offer my help (and hardware) to help track it down. I
attached the internal log4net log to the ticket.
The hardware we are using is pretty high end (
http://www8.hp.com/us/en/products/proliant-servers/product-detail.html?oid=4231377#!tab=features).
We are using this beast as a combined image processor and web server,
running IIS 7.5 on Windows Server 2008 R2.
For every transaction processed, a line is written to log4net. On
average, we are probably doing around 7 transactions per second. This
works fine for a while, until it doesn’t. Then it starts losing log files
during rollover.
Given all the value we have gotten out of log4net, my company is willing
to donate some of my time and some time on that DL980 to test out the fixes…
Is there anyone who can give me a push in the right direction?
Thanks,
Doug
Douglas Neary
2014-01-16 22:31:17 UTC
Permalink
There is a known issue with log4net when you have mulitple sites logging to the same file
I don't think I'm doing that (see my last post to the list).
Just guessing, but I might also guess that you'd risk seeing the same problem under heavy load of a single site.
I have not seen that (yet). I can't help but wonder if 3 processes named w3wp.exe with different PIDs is a factor here.
Both of your missing files appear to happen due to problems accessing the main log file when it is about to be rolled over for the first time. This then propagates for each roll over, causing them to end up as #7 and #26 missing in the snapshot you provided.
Agreed.
If your problem is caused by multiple sites, you may want to try http://log4netremotelogging.codeplex.com/ which I wrote a few years ago to handle this and other issues with consolidating logs from several sites/servers.
I may end up doing this eventually, but I'd rather give a try at fixing the rolling file appender. This software is installed at customer sites, and they are very picky about anything other than text files on the local servers.

Thank you again.

Doug

From: Svante Seleborg [mailto:***@axantum.com]
Sent: Tuesday, January 14, 2014 11:59 PM
To: Log4NET User
Subject: Re: log4net rolling file appender losing log files during roll over

Hello,

There is a known issue with log4net when you have mulitple sites logging to the same file causing this exact situation (among other issues). Please ensure this is not the case in your configuration. Just guessing, but I might also guess that you'd risk seeing the same problem under heavy load of a single site. Both of your missing files appear to happen due to problems accessing the main log file when it is about to be rolled over for the first time. This then propagates for each roll over, causing them to end up as #7 and #26 missing in the snapshot you provided.

If your problem is caused by multiple sites, you may want to try http://log4netremotelogging.codeplex.com/ which I wrote a few years ago to handle this and other issues with consolidating logs from several sites/servers.

Svante

On Tue, Jan 14, 2014 at 11:36 PM, Douglas Neary <***@miteksystems.com<mailto:***@miteksystems.com>> wrote:
All:

Under heavy strain, I am seeing log4net lose log files during roll over. I filed a bug yesterday (https://issues.apache.org/jira/browse/LOG4NET-416), and I wanted to offer my help (and hardware) to help track it down. I attached the internal log4net log to the ticket.

The hardware we are using is pretty high end (http://www8.hp.com/us/en/products/proliant-servers/product-detail.html?oid=4231377#!tab=features). We are using this beast as a combined image processor and web server, running IIS 7.5 on Windows Server 2008 R2.

For every transaction processed, a line is written to log4net. On average, we are probably doing around 7 transactions per second. This works fine for a while, until it doesn't. Then it starts losing log files during rollover.

Given all the value we have gotten out of log4net, my company is willing to donate some of my time and some time on that DL980 to test out the fixes...

Is there anyone who can give me a push in the right direction?

Thanks,
Doug
Loading...