Server autorestart #55

buyolitsez · 2025-12-03T12:24:31Z

Hi, I use cmus with cmusfm on macos. At first, thank you for your script! However, sometimes it stops scrobbling for me after some time, and the solution is to manually restart the cmusfm server process. So, I tried to add auto-restart in the cmusfm to make it automatic. I'm not sure that I did it properly, but now it works for me without any problems, so I guess the problem should be fixed

Could you please check the PR?

arkq · 2025-12-03T13:08:45Z

However, sometimes it stops scrobbling for me after some time, and the solution is to manually restart the cmusfm server process.

Hmm..., it would be nice to know what is the actual cause of the server not being started.

src/server.c

+			ssize_t inot_read = read(pfds[2].fd, &inot_even, sizeof(inot_even));
+			if (inot_read > 0) {
+				debug("Inotify event occurred: %x", inot_even.mask);
+				cmusfm_config_read(cmusfm_config_file, &config);


src/server.c

-			if (rd_len >= sizeof(struct cmusfm_data_record))
+			if (rd_len > 0 && rd_len >= sizeof(struct cmusfm_data_record))
 				cmusfm_server_process_data(sbs, (struct cmusfm_data_record *)buffer);
+			else if (rd_len < 0)


buyolitsez · 2025-12-03T13:14:02Z

However, sometimes it stops scrobbling for me after some time, and the solution is to manually restart the cmusfm server process.

Hmm..., it would be nice to know what is the actual cause of the server not being started.

I'll run the main branch now and wait for some time to see the debug logs. I'll post them here then

Copilot

Pull request overview

This pull request implements automatic server restart functionality to address intermittent scrobbling issues on macOS. The changes add error recovery mechanisms in both the server main loop and the client process that spawns the server.

Improves error handling in the server's main loop with better errno checking for poll, accept, read, and inotify operations
Adds automatic restart logic with exponential backoff (up to 5 retries) when the server crashes
Implements retry mechanism for track sending when the server becomes unavailable

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

File	Description
src/server.c	Enhanced error handling for poll, accept, read, and inotify operations with errno checking; added logic to close previous client connections before accepting new ones
src/main.c	Implemented auto-restart loop with exponential backoff in forked server process; added retry logic to restart server and resend track when initial send fails

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-03T13:14:21Z

src/main.c

+			if (pid == 0)
+				return cmusfm_server_start();


The retry logic spawns a new server process without the restart mechanism. When the send fails and the server needs to be restarted (lines 264-265), the forked child process directly calls cmusfm_server_start() without the retry loop that was added earlier (lines 234-246). This creates inconsistent behavior where the initial server startup has auto-restart capabilities, but the recovery path does not.

The child process should use the same restart loop logic for consistency.

Copilot · 2025-12-03T13:14:21Z

src/main.c

+		if (cmusfm_server_check() == 0) {
+			pid_t pid;
+			if ((pid = fork()) == -1) {
+				perror("ERROR: Fork server (retry)");
+				return EXIT_FAILURE;
+			}
+			if (pid == 0)
+				return cmusfm_server_start();
+			sleep(1);
+			/* Retry sending track */
+			if (cmusfm_server_send_track(tinfo) != 0) {
+				perror("ERROR: Send track (after restart)");
+				return EXIT_FAILURE;
+			}
+		} else {
+			perror("ERROR: Send track");
+			return EXIT_FAILURE;
+		}


Potential zombie process creation. When the retry mechanism spawns a new server process (line 260-265), the parent process does not wait for or track the child process. This is different from the initial server spawn (line 223) where the parent continues to execute but should eventually clean up child processes.

In the retry path, after forking, the parent should either track the child PID or ensure proper process management to avoid zombie processes accumulating over time.

Copilot · 2025-12-03T13:14:22Z

src/server.c

 			close(pfds[1].fd);
 			pfds[1].fd = -1;
-			if (rd_len >= sizeof(struct cmusfm_data_record))
+			if (rd_len > 0 && rd_len >= sizeof(struct cmusfm_data_record))


[nitpick] Redundant check in read length validation. The condition rd_len > 0 && rd_len >= sizeof(struct cmusfm_data_record) at line 377 has a redundant check. If rd_len >= sizeof(struct cmusfm_data_record) is true, then rd_len > 0 is always true since sizeof(struct cmusfm_data_record) is guaranteed to be positive.

Simplify to: if (rd_len >= sizeof(struct cmusfm_data_record))

Suggested change

if (rd_len > 0 && rd_len >= sizeof(struct cmusfm_data_record))

if (rd_len >= sizeof(struct cmusfm_data_record))

Copilot · 2025-12-03T13:14:22Z

src/main.c

+				restart_count++;
+				if (restart_count < max_restarts) {
+					sleep(delay);
+					delay = delay * 2;


The exponential backoff delay calculation will overflow for restart counts >= 6. The delay starts at 1 second and doubles each time (line 244: delay = delay * 2). With max_restarts = 5, the delays would be: 1s, 2s, 4s, 8s. However, if this logic were extended or modified, larger values could overflow an int.

Consider using a maximum cap for the delay value (e.g., delay = (delay * 2 < 60) ? delay * 2 : 60;) to prevent unbounded growth.

Suggested change

delay = delay * 2;

delay = (delay * 2 < 60) ? delay * 2 : 60;

Copilot · 2025-12-03T13:14:22Z

src/server.c

+			}
 			pfds[1].fd = accept(pfds[0].fd, NULL, NULL);
+			if (pfds[1].fd == -1) {
+				debug("Accept error: %s", strerror(errno));


Missing error handling for accept(). If accept() is interrupted by a signal (errno == EINTR), the current code logs the error and continues (lines 365-369), which is correct. However, for other error conditions like EMFILE (too many open files) or ENOMEM (out of memory), continuing the loop without any backoff or mitigation could lead to a tight loop consuming CPU resources.

Consider adding a small delay (e.g., usleep(100000) for 100ms) before continuing when accept() fails with certain error conditions to prevent spinning.

Suggested change

debug("Accept error: %s", strerror(errno));

debug("Accept error: %s", strerror(errno));

/* If accept fails due to resource exhaustion, add a small delay to prevent CPU spinning */

if (errno == EMFILE || errno == ENOMEM) {

usleep(100000); /* 100ms */

}

Copilot · 2025-12-03T13:14:23Z

src/main.c

+		if (pid == 0) {
+			/* Server process - restart on error */
+			int restart_count = 0;
+			const int max_restarts = 5;
+			int delay = 1; /* Start with 1 second delay */
+
+			while (restart_count < max_restarts) {
+				int result = cmusfm_server_start();
+				if (result == 0) {
+					/* Normal exit (shutdown signal) */
+					return EXIT_SUCCESS;
+				}
+				/* Server crashed - restart with backoff */
+				restart_count++;
+				if (restart_count < max_restarts) {
+					sleep(delay);
+					delay = delay * 2;
+				}
+			}
+			/* Too many restarts - give up */
+			return EXIT_FAILURE;
+		}


The new restart mechanism and retry logic lack test coverage. The auto-restart functionality (lines 230-246) and the retry-on-failure logic (lines 257-275) are significant behavioral changes that could fail silently or cause issues in production. Given that the repository has comprehensive test suites for server functionality (see test/test-server-*.c files), these new code paths should be covered by tests to ensure they work correctly under various failure scenarios.

Copilot · 2025-12-03T13:14:23Z

src/server.c

+			/* Close previous client connection if any */
+			if (pfds[1].fd != -1) {
+				debug("Closing previous client connection: %d", pfds[1].fd);
+				close(pfds[1].fd);


File descriptor not reset after closing. After closing the previous client connection (line 362), the code does not set pfds[1].fd = -1. This means that on the next iteration, if a new connection arrives, the check if (pfds[1].fd != -1) at line 360 will be true even though the fd was already closed, potentially leading to a double-close error or attempting to close an fd that has been reused by the system for another purpose.

Add pfds[1].fd = -1; after line 362 to properly reset the file descriptor.

Suggested change

close(pfds[1].fd);

close(pfds[1].fd);

pfds[1].fd = -1;

Copilot · 2025-12-03T13:14:23Z

src/server.c

 	debug("Entering server main loop");
 	while (server_on) {

-		if (poll(pfds, 3, -1) == -1)
-			break;  /* signal interruption */
+		int poll_result = poll(pfds, 3, -1);
+		if (poll_result == -1) {
+			if (errno == EINTR) {
+				/* Interrupted by signal, continue loop */
+				continue;
+			}
+			/* Other poll errors - log and exit */
+			debug("Poll error: %s", strerror(errno));
+			retval = -1;
+			break;
+		}


The static variable server_on is not reset between restarts. The server_on variable (declared as static bool server_on = true; in the file) is set to false by the signal handler when the server is stopped. However, when the server is restarted by the retry logic in main.c (lines 234-246), this variable remains false from the previous run, causing the new server instance to exit immediately without entering the main loop.

The server_on variable should be reset to true at the beginning of cmusfm_server_start() to ensure it works correctly across restarts.

Copilot · 2025-12-03T13:14:23Z

src/server.c

+		int poll_result = poll(pfds, 3, -1);
+		if (poll_result == -1) {
+			if (errno == EINTR) {
+				/* Interrupted by signal, continue loop */


[nitpick] Potential issue with checking EINTR during signal-based shutdown. When a signal (SIGTERM, SIGHUP, or SIGINT) is received, the signal handler sets server_on = false and may interrupt the poll() call with EINTR. However, the current logic handles EINTR by continuing the loop (lines 348-351). This means that when the signal handler sets server_on = false and poll() returns with EINTR, the code will continue, check server_on in the while condition (line 344), and exit normally with retval = 0.

While this appears to work correctly, it's worth noting that this behavior is implicit and may not be immediately obvious. The signal handling works as intended, but the interaction between EINTR handling and server_on could be documented with a comment.

Suggested change

/* Interrupted by signal, continue loop */

/*

* EINTR may be caused by a signal (e.g., SIGTERM, SIGHUP, SIGINT).

* The signal handler sets server_on = false, so the next loop

* iteration will exit cleanly. This continue ensures we re-check

* server_on before proceeding.

*/

Server autorestart

79801f8

arkq requested a review from Copilot December 3, 2025 13:08

Copilot started reviewing on behalf of arkq December 3, 2025 13:09 View session

github-advanced-security bot found potential problems Dec 3, 2025

View reviewed changes

Copilot finished reviewing on behalf of arkq December 3, 2025 13:12

Copilot AI reviewed Dec 3, 2025

View reviewed changes

	if (rd_len > 0 && rd_len >= sizeof(struct cmusfm_data_record))
	if (rd_len >= sizeof(struct cmusfm_data_record))

	delay = delay * 2;
	delay = (delay * 2 < 60) ? delay * 2 : 60;

-				debug("Accept error: %s", strerror(errno));
+				debug("Accept error: %s", strerror(errno));
+				/* If accept fails due to resource exhaustion, add a small delay to prevent CPU spinning */
+				if (errno == EMFILE || errno == ENOMEM) {
+					usleep(100000); /* 100ms */
+				}

-				/* Interrupted by signal, continue loop */
+				/*
+				 * EINTR may be caused by a signal (e.g., SIGTERM, SIGHUP, SIGINT).
+				 * The signal handler sets server_on = false, so the next loop
+				 * iteration will exit cleanly. This continue ensures we re-check
+				 * server_on before proceeding.
+				 */

Server autorestart #55

Are you sure you want to change the base?

Server autorestart #55

Uh oh!

Conversation

buyolitsez commented Dec 3, 2025

Uh oh!

arkq commented Dec 3, 2025

Uh oh!

Check failure

Uh oh!

Uh oh!

Check warning

buyolitsez commented Dec 3, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants