Forum: Dock Sud BBS

Networked message handling/thread going to 100% cpu & no imports

From Khelair@VERT to All on Sat Dec 6 21:36:44 2014

I know I've mentioned this before, but the Synchronet bug that a few people have talked about, the one that pegs a thread at 100% CPU usage after one of the networks tries to pull messages (I believe, though this is a glorified assumption at this point), has been bugging me a lot more often recently. At least once a day now I find that no messages have been imported to any of the networked subs for a prolonged period, and when I check the CPU stats, the sbbs process is inevitably pegged at 100%. A kill -15 won't stop it; after a while I kill -9 it and restart it, and things seem to work again. This time around I haven't noticed any particular sub-boards being corrupted in the process, but I've trimmed down the number of sub-boards I'm reading lately for lack of time, and FidoNet isn't posting anything for me because of an error that my RC doesn't seem to be able to help me fix.
So I'm not sure exactly which networked function it is, but when it happens it shuts down importing of all networked messages across 5 networks. It's really a hindrance, and I'd rather not fall back on a shell script that runs every hour, checks whether CPU usage has been pegged for too long, and then kills sbbs and restarts it. That just can't be good for anything.
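For reference, the detection half of such a watchdog could be sketched roughly as below. This is only an illustration, not anything that ships with Synchronet: the function name, the 95% threshold, and the three-sample rule are all assumptions, and collecting the CPU samples (from ps or top, say) as well as the actual kill and restart are deliberately left out.

```python
def pegged_too_long(cpu_samples, threshold=95.0, min_consecutive=3):
    """Return True if the newest `min_consecutive` samples are all >= threshold.

    cpu_samples is a list of CPU percentages for one process, oldest first.
    The threshold and sample count are hypothetical defaults, not values
    taken from Synchronet.
    """
    if len(cpu_samples) < min_consecutive:
        return False  # not enough history to call it stuck
    recent = cpu_samples[-min_consecutive:]
    return all(sample >= threshold for sample in recent)
```

Requiring several consecutive high samples keeps a short, legitimate burst (a big import, for example) from being mistaken for a hung thread.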
Can anybody give me some more information on how to get around this, since I still can't get a more recent version compiled on OBSD? I guess I could just default to disabling networked bases, one at a time (my preliminary suspect is FIDO), until it doesn't seem to happen any more, but that seems like it'd be unreliable and a really time-consuming way to get to the bottom of this.
Any input appreciated.

---
■ Synchronet ■ Tinfoil Tetrahedron BBS telnet://

From Access Denied@VERT to Khelair on Sun Dec 7 09:14:42 2014

Hello Khelair,

On 06 Dec 14 21:36, Khelair wrote to All:

> Can anybody give me some more information on how to get around this, since I still can't get a more recent version compiled on OBSD? I guess I could just default to disabling networked bases, one at a time (my preliminary suspect is FIDO), until it doesn't seem to happen any more, but that seems like it'd be unreliable and a really time-consuming way to get to the bottom of this. Any input appreciated.

When this is occurring, take a look in your /sbbs/data/ directory for *.now files. If one exists (usually fidoin.now, fidoout.now, or something similar), no other events will run until that one is done. So if no other events are running and one of those .now files exists, *that* is the one keeping the others from running.

With that, you can narrow down exactly which event is doing this. After you know that, and if it's fidoin.now, you can check /sbbs/data/sbbsecho.log for any errors importing messages during that timeframe. If it's fidoout.now check the same log for exporting errors.
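The check described above could be sketched in Python along these lines. It is a rough illustration only: the function name and the age threshold are assumptions, and it relies on nothing more than the behaviour described in this post, namely that a *.now file lingering in the data directory points at the event that is holding things up.

```python
import time
from pathlib import Path


def stale_semaphores(data_dir, max_age_seconds, now=None):
    """List (file name, age in seconds) for *.now files older than max_age_seconds.

    data_dir would be /sbbs/data on the setup discussed here; the threshold
    is a hypothetical value the sysop would pick.
    """
    now = time.time() if now is None else now
    stale = []
    for path in sorted(Path(data_dir).glob("*.now")):
        age = now - path.stat().st_mtime  # seconds since the file was last touched
        if age > max_age_seconds:
            stale.append((path.name, int(age)))
    return stale
```

Calling `stale_semaphores("/sbbs/data", 15 * 60)` would report any semaphore that has been sitting there for more than fifteen minutes, which tells you which log to read next.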

Sometimes it's a DOS event that processes door games for interBBS play. If one game hangs during processing, it stays locked up and your DOS emulator keeps running, pegging your CPU at 100%. Then again, you're running OpenBSD, so you may not have any DOS games being processed, unless you're using something like DOSCMD or maybe got DOSEMU to compile for it.

Regards,
Nick

--- GoldED+/LNX 1.1.5-b20

From Digital Man@VERT to Khelair on Sun Dec 7 18:29:19 2014

Re: Networked message handling/thread going to 100% cpu & no imports
By: Khelair to All on Sat Dec 06 2014 09:36 pm

> I know I've mentioned this before, but the Synchronet bug that a few people have talked about, the one that pegs a thread at 100% CPU usage after one of the networks tries to pull messages (I believe, though this is a glorified assumption at this point), has been bugging me a lot more often recently. At least once a day now I find that no messages have been imported to any of the networked subs for a prolonged period, and when I check the CPU stats, the sbbs process is inevitably pegged at 100%. A kill -15 won't stop it; after a while I kill -9 it and restart it, and things seem to work again. This time around I haven't noticed any particular sub-boards being corrupted in the process, but I've trimmed down the number of sub-boards I'm reading lately for lack of time, and FidoNet isn't posting anything for me because of an error that my RC doesn't seem to be able to help me fix.
> So I'm not sure exactly which networked function it is, but when it happens it shuts down importing of all networked messages across 5 networks. It's really a hindrance, and I'd rather not fall back on a shell script that runs every hour, checks whether CPU usage has been pegged for too long, and then kills sbbs and restarts it. That just can't be good for anything.
> Can anybody give me some more information on how to get around this, since I still can't get a more recent version compiled on OBSD? I guess I could just default to disabling networked bases, one at a time (my preliminary suspect is FIDO), until it doesn't seem to happen any more, but that seems like it'd be unreliable and a really time-consuming way to get to the bottom of this.
> Any input appreciated.

Are all of these "networks" FidoNet technology networks (FTNs)? If so, then the process that handles importing and exporting would be SBBSecho, not sbbs. Which process exactly do you see at 100% CPU utilization? What is the log output at the time this is occurring? What version of SBBS and SBBSecho are you using? Without more details, it's really hard to help.

digital man

Synchronet "Real Fact" #19:
Michael Swindell was directly responsible for Synchronet's commercial success.
Norco, CA WX: 67.0°F, 54.0% humidity, 0 mph WSW wind, 0.00 inches rain/24hrs
--

From Access Denied@VERT to Khelair on Wed Dec 10 17:15:14 2014

Hello Khelair,

On 09 Dec 14 20:36, Khelair wrote to Access Denied:

> Well, I caught a couple of atypical ones now. Straight-up crashes, where I've got an open session and I come back a while later and the connection is terminated. These appear to be happening right around the time that qnet-qwk.now is being created, though they don't appear to have anything in the associated .lo? file.

For one, you don't ever have to associate QWK messages with .?lo files whatsoever. Two completely different transfer protocols. My question for you would be, are you hosting a QWK network? Or maybe it's when you're polling VERT for Dovenet?

Maybe check your system log and see if there are any odd things going on right around the time it crashes.

Regards,
Nick

--- GoldED+/LNX 1.1.5-b20130910
* Origin: thePharcyde_ telnet://bbs.pharcyde.org (Wisconsin) (723:1/701)
■ Synchronet ■ thePharcyde_ telnet://bbs.pharcyde.org (Wisconsin)

From Khelair@VERT to Access Denied on Wed Dec 10 21:41:22 2014

Re: Re: Networked message handling/thread going to 100% cpu & no imports
By: Access Denied to Khelair on Wed Dec 10 2014 17:15:14

> don't appear to have anything in the associated .lo? file.

> For one, you don't ever have to associate QWK messages with .?lo files whatsoever. Two completely different transfer protocols. My question for you would be, are you hosting a QWK network? Or maybe it's when you're polling VERT for Dovenet?

I meant what I said about .lo? files, as in the ones that accumulate in /sbbs/data/logs/*.lo? (.log & .lol).

> Maybe check your system log and see if there are any odd things going on right around the time it crashes.

Yep, that's what I referenced doing in the above file extensions. ;)

---
■ Synchronet ■ Tinfoil Tetrahedron BBS telnet://tinfoil.synchro.net

From Access Denied@VERT to Khelair on Thu Dec 11 17:12:44 2014

Hello Khelair,

On 10 Dec 14 21:41, Khelair wrote to Access Denied:

> I meant what I said about .lo? files, as in the ones that accumulate in /sbbs/data/logs/*.lo? (.log & .lol).

> Maybe check your system log and see if there are any odd things going on right around the time it crashes.

> Yep, that's what I referenced doing in the above file extensions. ;)

I don't think those logs give you all the information about your system, do they? Maybe you compiled it that way for your OS?

Otherwise, check your system log. I use syslog-ng on Gentoo here, and it logs to /var/log/messages (aside from the stuff in the /sbbs/data/logs directory).
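To narrow a system log down to the minutes around a crash, something like the following sketch could help. It assumes traditional syslog lines that begin with a "Mon DD HH:MM:SS" stamp (which carries no year, hence the `year` parameter) and English month abbreviations; the function name and the five-minute default window are made up for illustration.

```python
from datetime import datetime, timedelta


def lines_near(lines, when, window_minutes=5, year=2014):
    """Keep log lines stamped within window_minutes of the datetime `when`.

    Assumes each relevant line starts with a 15-character
    'Mon DD HH:MM:SS' stamp; anything else is skipped.
    """
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # no parsable timestamp at the start of this line
        if abs(stamp - when) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

Feeding it the lines of /var/log/messages (or whatever file your syslog daemon writes to) together with the time the session dropped leaves only the entries worth reading.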

Regards,
Nick

--- GoldED+/LNX 1.1.5-b20130910
* Origin: thePharcyde_

From mark lewis@VERT to Khelair on Thu Dec 11 22:15:51 2014

On Wed, 10 Dec 2014, Khelair wrote to Access Denied:

> don't appear to have anything in the associated .lo? file.

> For one, you don't ever have to associate QWK messages with .?lo files

someone confused .lo? files with .?lo files... the latter are binkley style mailer files ;)

)\/(ark

* Origin: (1:3634/12)
---
■ Synchronet ■ Vertrauen ■ Home of Synchronet ■ telnet://vert.synchro.net

From Nicholas Boel@VERT to mark lewis on Thu Dec 11 22:57:06 2014

Hello mark,

On 11 Dec 14 22:15, mark lewis wrote to Khelair:

> For one, you don't ever have to associate QWK messages with .?lo files

> someone confused .lo? files with .?lo files... the latter are binkley style mailer files ;)

I did. But then again, I originally wasn't referring to anything in /sbbs/data/logs, either. I was referring to the system log (e.g., /var/log/messages in some Linux distros, journalctl on Arch Linux, etc.). In other words, your SYSTEM log, not your BBS logs. If installed normally, Synchronet will automatically log to your system logs if you don't tell it not to, or don't run as a daemon.

Regards,
Nick

--- GoldED+/LNX 1.1.5-b20130910
* Origin: thePharcyde_ telnet://bbs.pharcyde.org (Wisconsin)
