split a slice into n slices

Your algorithm uses step as the size of batches:

step := (len(t.portsToScan[proto]) + t.workers - 1) / t.workers

This isn’t the optimal size. For example if you have 4 ports to scan and 3 workers, this results in step = 2, which means you’ll have only 2 jobs (2+2=4). But it would be better (more optimal) to have 3 batches (with sizes 2+1+1=4).

So the size of batches should be

defSize := len(t.portsToScan[proto]) / t.workers

The problem with this is that if the length is not a multiple of t.workers, some of the last elements (ports) will not be assigned to any of the jobs. Using defSize+1 for all jobs would be too many.

So the optimal solution is in the “middle”: some jobs will have defSize ports to scan, and some will have defSize+1. How many must have defSize+1? As many as missing if all would have defSize:

numBigger := len(t.portsToScan[proto]) - defSize*t.workers

Note that if there are less ports to scan than workers, the above calculation yields defSize=0, so some workers would get 0 ports to scan, and some would get 1. That’s OK, but you shouldn’t add jobs with 0 ports to scan.

Using this distribution:

defSize := len(t.portsToScan[proto]) / t.workers
numBigger := len(t.portsToScan[proto]) - defSize*t.workers

size := defSize+1
for i, idx := 0, 0; i < t.workers; i++ {
    if i == numBigger {
        size--
        if size == 0 {
            break // 0 ports left to scan
        }
    }
    jobs = append(jobs, jobMsg{
        ip:       t.ip,
        protocol: proto,
        ports:    t.portsToScan[proto][idx : idx+size],
    })
    idx += size
}

CLICK HERE to find out more related problems solutions.

Leave a Comment

Your email address will not be published.

Scroll to Top