Commit 635d4b9

Author: CKI KWF Bot

Merge: mptcp: phase-1 backports for RHEL-10.2

MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-10/-/merge_requests/1560
JIRA: https://issues.redhat.com/browse/RHEL-115576
Upstream Status: all mainline in net.git
Conflicts: only one, see individual patches
Tested: boot-tested only
Omitted-fix: e84cb86 ("mptcp: pm: in-kernel: C-flag: handle late ADD_ADDR"): the new test on the 'C' flag can be flaky, we can stabilize it later in the development phase

Note: the upstream path manager has been re-worked so much that no present/future patch in that area would apply, unless all the re-work patches are backported. Therefore, the following series:
- 71ca356 ("Merge branch 'mptcp-pm-code-reorganisation'")
- 8121227 ("Merge branch 'mptcp-pm-misc-cleanups-part-2'")
- 7842f3d ("Merge branch 'mptcp-pm-misc-cleanups-part-3'")
plus some sparse "zero-impact" commits reworking the MPTCP path manager have been included.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Approved-by: Florian Westphal <fwestpha@redhat.com>
Approved-by: Paolo Abeni <pabeni@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Merged-by: CKI GitLab Kmaint Pipeline Bot <26919896-cki-kmaint-pipeline-bot@users.noreply.gitlab.com>
2 parents 25b0872 + f8de08c commit 635d4b9

31 files changed, +2959 -2435 lines changed

Documentation/netlink/specs/mptcp_pm.yaml

Lines changed: 25 additions & 23 deletions
@@ -22,65 +22,67 @@ definitions:
         doc: unused event
       -
         name: created
-        doc:
-          token, family, saddr4 | saddr6, daddr4 | daddr6, sport, dport
+        doc: >-
           A new MPTCP connection has been created. It is the good time to
           allocate memory and send ADD_ADDR if needed. Depending on the
           traffic-patterns it can take a long time until the
           MPTCP_EVENT_ESTABLISHED is sent.
+          Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
+          dport, server-side, [flags].
       -
         name: established
-        doc:
-          token, family, saddr4 | saddr6, daddr4 | daddr6, sport, dport
+        doc: >-
           A MPTCP connection is established (can start new subflows).
+          Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,
+          dport, server-side, [flags].
       -
         name: closed
-        doc:
-          token
+        doc: >-
           A MPTCP connection has stopped.
+          Attribute: token.
       -
         name: announced
         value: 6
-        doc:
-          token, rem_id, family, daddr4 | daddr6 [, dport]
+        doc: >-
           A new address has been announced by the peer.
+          Attributes: token, rem_id, family, daddr4 | daddr6 [, dport].
       -
         name: removed
-        doc:
-          token, rem_id
+        doc: >-
           An address has been lost by the peer.
+          Attributes: token, rem_id.
       -
         name: sub-established
         value: 10
-        doc:
-          token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport,
-          dport, backup, if_idx [, error]
+        doc: >-
           A new subflow has been established. 'error' should not be set.
+          Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 |
+          daddr6, sport, dport, backup, if_idx [, error].
       -
         name: sub-closed
-        doc:
-          token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport,
-          dport, backup, if_idx [, error]
+        doc: >-
           A subflow has been closed. An error (copy of sk_err) could be set if an
           error has been detected for this subflow.
+          Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 |
+          daddr6, sport, dport, backup, if_idx [, error].
       -
         name: sub-priority
         value: 13
-        doc:
-          token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport,
-          dport, backup, if_idx [, error]
+        doc: >-
           The priority of a subflow has changed. 'error' should not be set.
+          Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 |
+          daddr6, sport, dport, backup, if_idx [, error].
       -
         name: listener-created
         value: 15
-        doc:
-          family, sport, saddr4 | saddr6
+        doc: >-
           A new PM listener is created.
+          Attributes: family, sport, saddr4 | saddr6.
       -
         name: listener-closed
-        doc:
-          family, sport, saddr4 | saddr6
+        doc: >-
           A PM listener is closed.
+          Attributes: family, sport, saddr4 | saddr6.
 
 attribute-sets:
   -

Documentation/networking/mptcp-sysctl.rst

Lines changed: 18 additions & 0 deletions
@@ -12,6 +12,8 @@ add_addr_timeout - INTEGER (seconds)
 	resent to an MPTCP peer that has not acknowledged a previous
 	ADD_ADDR message.
 
+	Do not retransmit if set to 0.
+
 	The default value matches TCP_RTO_MAX. This is a per-namespace
 	sysctl.
 
@@ -108,3 +110,19 @@ stale_loss_cnt - INTEGER
 	This is a per-namespace sysctl.
 
 	Default: 4
+
+syn_retrans_before_tcp_fallback - INTEGER
+	The number of SYN + MP_CAPABLE retransmissions before falling back to
+	TCP, i.e. dropping the MPTCP options. In other words, if all the packets
+	are dropped on the way, there will be:
+
+	* The initial SYN with MPTCP support
+	* This number of SYN retransmitted with MPTCP support
+	* The next SYN retransmissions will be without MPTCP support
+
+	0 means the first retransmission will be done without MPTCP options.
+	>= 128 means that all SYN retransmissions will keep the MPTCP options. A
+	lower number might increase false-positive MPTCP blackholes detections.
+	This is a per-namespace sysctl.
+
+	Default: 2
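
The three bullet points in the `syn_retrans_before_tcp_fallback` text can be read as a simple rule on the retransmission index. The following user-space sketch models only that documented rule; `syn_carries_mptcp()` and `print_syn_sequence()` are hypothetical helpers, not kernel functions, and the `>= 128` special case is deliberately left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper, not a kernel function.
 * retrans: 0 for the initial SYN, 1 for the first retransmission, ...
 * limit:   value of syn_retrans_before_tcp_fallback (default 2)
 */
static bool syn_carries_mptcp(unsigned int retrans, unsigned int limit)
{
	return retrans <= limit;
}

/* Print one line per SYN for the "every packet is dropped" scenario. */
static void print_syn_sequence(unsigned int limit, unsigned int total_retrans)
{
	unsigned int i;

	assert(limit < 128);	/* the ">= 128" case is not modelled here */

	for (i = 0; i <= total_retrans; i++)
		printf("SYN %u: %s MPTCP options\n", i,
		       syn_carries_mptcp(i, limit) ? "with" : "without");
}
```

With the default of 2, the model reports the initial SYN and two retransmissions with MPTCP options, and every later retransmission without them. With 0, only the initial SYN carries the options.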

include/net/mptcp.h

Lines changed: 2 additions & 10 deletions
@@ -100,17 +100,9 @@ struct mptcp_out_options {
 #define MPTCP_SCHED_MAX 128
 #define MPTCP_SCHED_BUF_MAX (MPTCP_SCHED_NAME_MAX * MPTCP_SCHED_MAX)
 
-#define MPTCP_SUBFLOWS_MAX 8
-
-struct mptcp_sched_data {
-	bool reinject;
-	u8 subflows;
-	struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
-};
-
 struct mptcp_sched_ops {
-	int (*get_subflow)(struct mptcp_sock *msk,
-			   struct mptcp_sched_data *data);
+	int (*get_send)(struct mptcp_sock *msk);
+	int (*get_retrans)(struct mptcp_sock *msk);
 
 	char name[MPTCP_SCHED_NAME_MAX];
 	struct module *owner;
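
To make the shape of this API change concrete, here is a minimal user-space sketch of a scheduler with separate send and retransmit callbacks. Everything prefixed with `mock_` or `demo_` is hypothetical, and the dispatch on a `reinject` flag is an assumption borrowed from the removed `mptcp_sched_data.reinject` field; the hunk does not show how the kernel actually picks between the two callbacks.

```c
#include <stdbool.h>

#define MOCK_SCHED_NAME_MAX 16	/* stand-in for MPTCP_SCHED_NAME_MAX */

/* Mock connection state: only counts how often each callback ran. */
struct mock_msk {
	int send_calls;
	int retrans_calls;
};

/* Mirrors the two callbacks added to struct mptcp_sched_ops. */
struct mock_sched_ops {
	int (*get_send)(struct mock_msk *msk);
	int (*get_retrans)(struct mock_msk *msk);
	char name[MOCK_SCHED_NAME_MAX];
};

static int demo_get_send(struct mock_msk *msk)
{
	msk->send_calls++;
	return 0;
}

static int demo_get_retrans(struct mock_msk *msk)
{
	msk->retrans_calls++;
	return 0;
}

static const struct mock_sched_ops demo_sched = {
	.get_send	= demo_get_send,
	.get_retrans	= demo_get_retrans,
	.name		= "demo",
};

/* Assumed dispatch: retransmissions use get_retrans, new data uses get_send. */
static int mock_sched_pick(const struct mock_sched_ops *ops,
			   struct mock_msk *msk, bool reinject)
{
	return reinject ? ops->get_retrans(msk) : ops->get_send(msk);
}
```

Compared with the old single `get_subflow()` callback, the caller no longer fills in a shared data structure; each callback receives only the connection.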

include/uapi/linux/mptcp.h

Lines changed: 2 additions & 0 deletions
@@ -31,6 +31,8 @@
 #define MPTCP_INFO_FLAG_FALLBACK _BITUL(0)
 #define MPTCP_INFO_FLAG_REMOTE_KEY_RECEIVED _BITUL(1)
 
+#define MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 _BITUL(0)
+
 #define MPTCP_PM_ADDR_FLAG_SIGNAL (1 << 0)
 #define MPTCP_PM_ADDR_FLAG_SUBFLOW (1 << 1)
 #define MPTCP_PM_ADDR_FLAG_BACKUP (1 << 2)
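
A userspace path manager would presumably test the new bit on the optional `flags` attribute of the created/established events. The sketch below assumes the attribute value has already been extracted from the netlink message (parsing is out of scope), and `peer_denies_join_id0()` is a hypothetical helper. The macro is redefined locally only so the example builds against headers that predate this change.

```c
#include <stdbool.h>

/* Local fallback so the sketch builds without the updated UAPI header;
 * _BITUL(0) is bit 0 of an unsigned long.
 */
#ifndef MPTCP_PM_EV_FLAG_DENY_JOIN_ID0
#define MPTCP_PM_EV_FLAG_DENY_JOIN_ID0 (1UL << 0)
#endif

/* Hypothetical helper: ev_flags is the already-parsed value of the
 * optional "flags" event attribute (0 if the attribute is absent).
 */
static bool peer_denies_join_id0(unsigned long ev_flags)
{
	return (ev_flags & MPTCP_PM_EV_FLAG_DENY_JOIN_ID0) != 0;
}
```

Other bits are ignored, so the check keeps working if further event flags are defined later.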

include/uapi/linux/mptcp_pm.h

Lines changed: 26 additions & 24 deletions
@@ -12,31 +12,33 @@
 /**
  * enum mptcp_event_type
  * @MPTCP_EVENT_UNSPEC: unused event
- * @MPTCP_EVENT_CREATED: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport A new MPTCP connection has been created. It is the good time
- * to allocate memory and send ADD_ADDR if needed. Depending on the
+ * @MPTCP_EVENT_CREATED: A new MPTCP connection has been created. It is the
+ * good time to allocate memory and send ADD_ADDR if needed. Depending on the
  * traffic-patterns it can take a long time until the MPTCP_EVENT_ESTABLISHED
- * is sent.
- * @MPTCP_EVENT_ESTABLISHED: token, family, saddr4 | saddr6, daddr4 | daddr6,
- * sport, dport A MPTCP connection is established (can start new subflows).
- * @MPTCP_EVENT_CLOSED: token A MPTCP connection has stopped.
- * @MPTCP_EVENT_ANNOUNCED: token, rem_id, family, daddr4 | daddr6 [, dport] A
- * new address has been announced by the peer.
- * @MPTCP_EVENT_REMOVED: token, rem_id An address has been lost by the peer.
- * @MPTCP_EVENT_SUB_ESTABLISHED: token, family, loc_id, rem_id, saddr4 |
- * saddr6, daddr4 | daddr6, sport, dport, backup, if_idx [, error] A new
- * subflow has been established. 'error' should not be set.
- * @MPTCP_EVENT_SUB_CLOSED: token, family, loc_id, rem_id, saddr4 | saddr6,
- * daddr4 | daddr6, sport, dport, backup, if_idx [, error] A subflow has been
- * closed. An error (copy of sk_err) could be set if an error has been
- * detected for this subflow.
- * @MPTCP_EVENT_SUB_PRIORITY: token, family, loc_id, rem_id, saddr4 | saddr6,
- * daddr4 | daddr6, sport, dport, backup, if_idx [, error] The priority of a
- * subflow has changed. 'error' should not be set.
- * @MPTCP_EVENT_LISTENER_CREATED: family, sport, saddr4 | saddr6 A new PM
- * listener is created.
- * @MPTCP_EVENT_LISTENER_CLOSED: family, sport, saddr4 | saddr6 A PM listener
- * is closed.
+ * is sent. Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
+ * sport, dport, server-side, [flags].
+ * @MPTCP_EVENT_ESTABLISHED: A MPTCP connection is established (can start new
+ * subflows). Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6,
+ * sport, dport, server-side, [flags].
+ * @MPTCP_EVENT_CLOSED: A MPTCP connection has stopped. Attribute: token.
+ * @MPTCP_EVENT_ANNOUNCED: A new address has been announced by the peer.
+ * Attributes: token, rem_id, family, daddr4 | daddr6 [, dport].
+ * @MPTCP_EVENT_REMOVED: An address has been lost by the peer. Attributes:
+ * token, rem_id.
+ * @MPTCP_EVENT_SUB_ESTABLISHED: A new subflow has been established. 'error'
+ * should not be set. Attributes: token, family, loc_id, rem_id, saddr4 |
+ * saddr6, daddr4 | daddr6, sport, dport, backup, if_idx [, error].
+ * @MPTCP_EVENT_SUB_CLOSED: A subflow has been closed. An error (copy of
+ * sk_err) could be set if an error has been detected for this subflow.
+ * Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 |
+ * daddr6, sport, dport, backup, if_idx [, error].
+ * @MPTCP_EVENT_SUB_PRIORITY: The priority of a subflow has changed. 'error'
+ * should not be set. Attributes: token, family, loc_id, rem_id, saddr4 |
+ * saddr6, daddr4 | daddr6, sport, dport, backup, if_idx [, error].
+ * @MPTCP_EVENT_LISTENER_CREATED: A new PM listener is created. Attributes:
+ * family, sport, saddr4 | saddr6.
+ * @MPTCP_EVENT_LISTENER_CLOSED: A PM listener is closed. Attributes: family,
+ * sport, saddr4 | saddr6.
  */
 enum mptcp_event_type {
 	MPTCP_EVENT_UNSPEC,
net/mptcp/Makefile

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ obj-$(CONFIG_MPTCP) += mptcp.o
 
 mptcp-y := protocol.o subflow.o options.o token.o crypto.o ctrl.o pm.o diag.o \
 	   mib.o pm_netlink.o sockopt.o pm_userspace.o fastopen.o sched.o \
-	   mptcp_pm_gen.o
+	   mptcp_pm_gen.o pm_kernel.o
 
 obj-$(CONFIG_SYN_COOKIES) += syncookies.o
 obj-$(CONFIG_INET_MPTCP_DIAG) += mptcp_diag.o

net/mptcp/ctrl.c

Lines changed: 29 additions & 10 deletions
@@ -32,6 +32,7 @@ struct mptcp_pernet {
 	unsigned int close_timeout;
 	unsigned int stale_loss_cnt;
 	atomic_t active_disable_times;
+	u8 syn_retrans_before_tcp_fallback;
 	unsigned long active_disable_stamp;
 	u8 mptcp_enabled;
 	u8 checksum_enabled;
@@ -92,6 +93,7 @@ static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet)
 	pernet->mptcp_enabled = 0;
 	pernet->add_addr_timeout = TCP_RTO_MAX;
 	pernet->blackhole_timeout = 3600;
+	pernet->syn_retrans_before_tcp_fallback = 2;
 	atomic_set(&pernet->active_disable_times, 0);
 	pernet->close_timeout = TCP_TIMEWAIT_LEN;
 	pernet->checksum_enabled = 0;
@@ -245,6 +247,12 @@ static struct ctl_table mptcp_sysctl_table[] = {
 		.proc_handler = proc_blackhole_detect_timeout,
 		.extra1 = SYSCTL_ZERO,
 	},
+	{
+		.procname = "syn_retrans_before_tcp_fallback",
+		.maxlen = sizeof(u8),
+		.mode = 0644,
+		.proc_handler = proc_dou8vec_minmax,
+	},
 };
 
 static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
@@ -269,6 +277,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 	/* table[7] is for available_schedulers which is read-only info */
 	table[8].data = &pernet->close_timeout;
 	table[9].data = &pernet->blackhole_timeout;
+	table[10].data = &pernet->syn_retrans_before_tcp_fallback;
 
 	hdr = register_net_sysctl_sz(net, MPTCP_SYSCTL_PATH, table,
 				     ARRAY_SIZE(mptcp_sysctl_table));
@@ -385,29 +394,39 @@ void mptcp_active_enable(struct sock *sk)
 
 		if (dst && dst->dev && (dst->dev->flags & IFF_LOOPBACK))
 			atomic_set(&pernet->active_disable_times, 0);
+
+		dst_release(dst);
 	}
 }
 
 /* Check the number of retransmissions, and fallback to TCP if needed */
 void mptcp_active_detect_blackhole(struct sock *ssk, bool expired)
 {
 	struct mptcp_subflow_context *subflow;
-	u32 timeouts;
+	u8 timeouts, to_max;
+	struct net *net;
 
-	if (!sk_is_mptcp(ssk))
+	/* Only check MPTCP SYN ... */
+	if (likely(!sk_is_mptcp(ssk) || ssk->sk_state != TCP_SYN_SENT))
 		return;
 
-	timeouts = inet_csk(ssk)->icsk_retransmits;
 	subflow = mptcp_subflow_ctx(ssk);
 
-	if (subflow->request_mptcp && ssk->sk_state == TCP_SYN_SENT) {
-		if (timeouts == 2 || (timeouts < 2 && expired)) {
-			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPCAPABLEACTIVEDROP);
-			subflow->mpc_drop = 1;
-			mptcp_subflow_early_fallback(mptcp_sk(subflow->conn), subflow);
-		}
-	} else if (ssk->sk_state == TCP_SYN_SENT) {
+	/* ... + MP_CAPABLE */
+	if (!subflow->request_mptcp) {
+		/* Mark as blackhole iif the 1st non-MPTCP SYN is accepted */
 		subflow->mpc_drop = 0;
+		return;
+	}
+
+	net = sock_net(ssk);
+	timeouts = inet_csk(ssk)->icsk_retransmits;
+	to_max = mptcp_get_pernet(net)->syn_retrans_before_tcp_fallback;
+
+	if (timeouts == to_max || (timeouts < to_max && expired)) {
+		MPTCP_INC_STATS(net, MPTCP_MIB_MPCAPABLEACTIVEDROP);
+		subflow->mpc_drop = 1;
+		mptcp_subflow_early_fallback(mptcp_sk(subflow->conn), subflow);
 	}
 }
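
The fallback decision in `mptcp_active_detect_blackhole()` now compares against the per-namespace limit instead of the hard-coded 2. The condition can be exercised in isolation with a small user-space model. `should_fallback_to_tcp()` is a hypothetical name; its parameters mirror the `timeouts`, `to_max` and `expired` variables of the hunk, and the socket-state checks that precede the condition in the kernel are assumed to have passed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the condition guarding the early fallback to TCP.
 * timeouts: SYN retransmissions recorded so far (icsk_retransmits)
 * to_max:   syn_retrans_before_tcp_fallback for this namespace
 * expired:  value of the "expired" argument passed by the caller
 */
static bool should_fallback_to_tcp(uint8_t timeouts, uint8_t to_max,
				   bool expired)
{
	return timeouts == to_max || (timeouts < to_max && expired);
}
```

In this model the fallback triggers exactly once when the counter reaches the limit, or earlier if `expired` is set before the limit is reached. Once the counter is past the limit the condition stays false.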

net/mptcp/diag.c

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ static int subflow_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin
 		flags |= MPTCP_SUBFLOW_FLAG_BKUP_REM;
 	if (sf->request_bkup)
 		flags |= MPTCP_SUBFLOW_FLAG_BKUP_LOC;
-	if (sf->fully_established)
+	if (READ_ONCE(sf->fully_established))
 		flags |= MPTCP_SUBFLOW_FLAG_FULLY_ESTABLISHED;
 	if (sf->conn_finished)
 		flags |= MPTCP_SUBFLOW_FLAG_CONNECTED;
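
For readers less familiar with `READ_ONCE()`, the sketch below shows the general idea in user space: the field is loaded through a volatile-qualified pointer so the compiler emits a single load. This is a simplified stand-in, not the kernel's definition; it relies on the GCC/Clang `__typeof__` extension, and `mock_subflow` and the `MOCK_FLAG_*` values are hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel macro: one volatile load of x. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical flag values, not the UAPI ones. */
#define MOCK_FLAG_FULLY_ESTABLISHED (1U << 0)
#define MOCK_FLAG_CONNECTED         (1U << 1)

/* Mock of the two subflow fields read at the end of the hunk. */
struct mock_subflow {
	bool fully_established;	/* may be written by another context */
	bool conn_finished;
};

static unsigned int mock_subflow_flags(struct mock_subflow *sf)
{
	unsigned int flags = 0;

	if (READ_ONCE(sf->fully_established))
		flags |= MOCK_FLAG_FULLY_ESTABLISHED;
	if (sf->conn_finished)
		flags |= MOCK_FLAG_CONNECTED;
	return flags;
}
```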

net/mptcp/options.c

Lines changed: 11 additions & 8 deletions
@@ -458,7 +458,7 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
 		return false;
 
 	/* MPC/MPJ needed only on 3rd ack packet, DATA_FIN and TCP shutdown take precedence */
-	if (subflow->fully_established || snd_data_fin_enable ||
+	if (READ_ONCE(subflow->fully_established) || snd_data_fin_enable ||
 	    subflow->snd_isn != TCP_SKB_CB(skb)->seq ||
 	    sk->sk_state != TCP_ESTABLISHED)
 		return false;
@@ -935,7 +935,7 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
 	/* here we can process OoO, in-window pkts, only in-sequence 4th ack
 	 * will make the subflow fully established
 	 */
-	if (likely(subflow->fully_established)) {
+	if (likely(READ_ONCE(subflow->fully_established))) {
 		/* on passive sockets, check for 3rd ack retransmission
 		 * note that msk is always set by subflow_syn_recv_sock()
 		 * for mp_join subflows
@@ -979,18 +979,19 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
 		if (subflow->mp_join)
 			goto reset;
 		subflow->mp_capable = 0;
+		if (!mptcp_try_fallback(ssk))
+			goto reset;
 		pr_fallback(msk);
-		mptcp_do_fallback(ssk);
 		return false;
 	}
 
-	if (mp_opt->deny_join_id0)
-		WRITE_ONCE(msk->pm.remote_deny_join_id0, true);
-
 	if (unlikely(!READ_ONCE(msk->pm.server_side)))
 		pr_warn_once("bogus mpc option on established client sk");
 
 set_fully_established:
+	if (mp_opt->deny_join_id0)
+		WRITE_ONCE(msk->pm.remote_deny_join_id0, true);
+
 	mptcp_data_lock((struct sock *)msk);
 	__mptcp_subflow_fully_established(msk, subflow, mp_opt);
 	mptcp_data_unlock((struct sock *)msk);
@@ -1117,7 +1118,9 @@ static bool add_addr_hmac_valid(struct mptcp_sock *msk,
 	return hmac == mp_opt->ahmac;
 }
 
-/* Return false if a subflow has been reset, else return true */
+/* Return false in case of error (or subflow has been reset),
+ * else return true.
+ */
 bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
@@ -1221,7 +1224,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 
 	mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
 	if (!mpext)
-		return true;
+		return false;
 
 	memset(mpext, 0, sizeof(*mpext));
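
The `check_fully_established()` hunk replaces an unconditional fallback with one that can be refused, in which case the subflow is reset. The control flow can be sketched in user space as follows. All names are hypothetical, and the `allow_fallback` field only stands in for whatever condition makes `mptcp_try_fallback()` return false, which this diff does not show.

```c
#include <stdbool.h>

enum mock_outcome {
	MOCK_RESET,	/* subflow is reset */
	MOCK_FALLBACK,	/* connection continues as plain TCP */
};

struct mock_msk {
	bool allow_fallback;	/* hypothetical: may fallback still happen? */
	bool fallen_back;
};

/* Stand-in for mptcp_try_fallback(): report whether fallback happened. */
static bool mock_try_fallback(struct mock_msk *msk)
{
	if (!msk->allow_fallback)
		return false;
	msk->fallen_back = true;
	return true;
}

/* Model of the branch taken when the expected MPTCP option is missing. */
static enum mock_outcome mock_handle_missing_option(struct mock_msk *msk,
						    bool mp_join)
{
	if (mp_join)
		return MOCK_RESET;	/* joins are never allowed to fall back */
	if (!mock_try_fallback(msk))
		return MOCK_RESET;	/* new in this change: refused fallback */
	return MOCK_FALLBACK;
}
```

The same hunk set also tightens `mptcp_incoming_options()`: a failed `skb_ext_add()` now returns false, matching the updated comment that false covers errors as well as resets.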
